On 2025/6/11 18:06, Duan, Zhenzhong wrote:


-----Original Message-----
From: Liu, Yi L <yi.l....@intel.com>
Subject: Re: [PATCH v1 02/15] intel_iommu: Optimize context entry cache
utilization

On 2025/6/6 18:04, Zhenzhong Duan wrote:
There are many call sites referencing context entry by calling
vtd_dev_to_context_entry() which will traverse the DMAR table.

In most cases we can use cached context entry in vtd_as->context_cache_entry
except when its entry is stale. Currently only global and domain context
invalidation stale it.

So introduce a helper function vtd_as_to_context_entry() to fetch from cache
before trying with vtd_dev_to_context_entry().

The cached context entry is now protected by vtd_iommu_lock(). While not
all caller of vtd_dev_to_context_entry() are under this lock.

Also, the cached context entry is created in the translate path. IMHO,
this path is not supposed to be triggered for passthrough devices.
While this may need double check and may change in the future. But let's
see if any locking issue with the current code.

Good finding, yes.
Previously I thought translation path updates cc_entry->context_entry after 
cc_entry->context_cache_gen.
In vtd_as_to_context_entry() cc_entry->context_cache_gen is checked first, so 
there was no real race.
But I still missed a memory barrier like below:

yeah, testing context_cache_gen is necessary. But without lock, this
cannot guarantee the cc_entry is valid after the test.

@@ -2277,6 +2286,7 @@ static bool vtd_do_iommu_translate(VTDAddressSpace 
*vtd_as, PCIBus *bus,
                                    cc_entry->context_cache_gen,
                                    s->context_cache_gen);
          cc_entry->context_entry = ce;
+        smp_wmb();
          cc_entry->context_cache_gen = s->context_cache_gen;
      }
Another option I can think of is adding lock to cache reading like below:

this is in-enough as well since the cc_entry->context_entry can be modified
after lock is released.

@@ -1659,11 +1659,15 @@ static int vtd_as_to_context_entry(VTDAddressSpace 
*vtd_as, VTDContextEntry *ce)
      uint8_t devfn = vtd_as->devfn;
      VTDContextCacheEntry *cc_entry = &vtd_as->context_cache_entry;

+    vtd_iommu_lock(s);
+
      /* Try to fetch context-entry from cache first */
      if (cc_entry->context_cache_gen == s->context_cache_gen) {
          *ce = cc_entry->context_entry;
+        vtd_iommu_unlock(s);
          return 0;
      } else {
+        vtd_iommu_unlock(s);
          return vtd_dev_to_context_entry(s, bus_num, devfn, ce);
      }
  }

Which one do you prefer?

If it's just optimization, perhaps just drop it. :)

--
Regards,
Yi Liu

Reply via email to