Hi Sairaj Kodilkar,

On 3/20/25 1:11 AM, Arun Kodilkar, Sairaj wrote:


On 3/11/2025 8:54 PM, Alejandro Jimenez wrote:
The AMD I/O Virtualization Technology (IOMMU) Specification (see Table 8: V,
TV, and GV Fields in Device Table Entry), specifies that a DTE with V=0,
TV=1 does not contain a valid address translation information.  If a request
requires a table walk, the walk is terminated when this condition is
encountered.

Do not assume that addresses for a device with DTE[TV]=0 are passed through
(i.e. not remapped) and instead terminate the page table walk early.

Cc: qemu-sta...@nongnu.org
Fixes: d29a09ca6842 ("hw/i386: Introduce AMD IOMMU")
Signed-off-by: Alejandro Jimenez <alejandro.j.jime...@oracle.com>
---
  hw/i386/amd_iommu.c | 88 +++++++++++++++++++++++++--------------------
  1 file changed, 49 insertions(+), 39 deletions(-)

diff --git a/hw/i386/amd_iommu.c b/hw/i386/amd_iommu.c
index cf00450ebe..31d5522a62 100644
--- a/hw/i386/amd_iommu.c
+++ b/hw/i386/amd_iommu.c
@@ -932,51 +932,61 @@ static void amdvi_page_walk(AMDVIAddressSpace *as, uint64_t *dte,
      uint64_t pte = dte[0], pte_addr, page_mask;
      /* make sure the DTE has TV = 1 */
-    if (pte & AMDVI_DEV_TRANSLATION_VALID) {
-        level = get_pte_translation_mode(pte);
-        if (level >= 7) {
-            trace_amdvi_mode_invalid(level, addr);
+    if (!(pte & AMDVI_DEV_TRANSLATION_VALID)) {
+        /*
+         * A DTE with V=1, TV=0 does not have a valid Page Table Root Pointer. +         * An IOMMU processing a request that requires a table walk terminates +         * the walk when it encounters this condition. Do the same and return +         * instead of assuming that the address is forwarded without translation +         * i.e. the passthrough case, as it is done for the case where DTE[V]=0.
+         */
Hi Alejandro,
According to AMD IOMMU specs TABLE 44 (IO_PAGE_FAULT Event Types), IOMMU
reports IO_PAGE_FAULT event when TV bit is not set in the DTE.

Hence you should use amdvi_page_fault to report the IO_PAGE_FAULT
event before returning.

Good point, I had not considered this (and really haven't paid attention to the page fault reporting until now). On first impression, I tend to agree with your observation and will include in on v2. That being said...

The current role of amdvi_page_walk() is to be a helper for the translate() method and ultimately the IOMMU replay() functionality. These are needed for emulation but do not necessarily correspond 1:1 with guest-initiated operations. e.g. VFIO code will call memory_region_iommu_replay() (which ends up calling amdvi_walk_page()) when syncing the dirty bitmap or after registering a new notifier. The guest would get IO_PAGE_FAULT events for all the regions where a mapping doesn't currently exist, which doesn't seem correct.

IOW, I think even this existing call to amdvi_page_fault() is not necessary/correct:

    do {
        pte_perms = amdvi_get_perms(pte);
        present = pte & 1;
        if (!present || perms != (perms & pte_perms)) {
            amdvi_page_fault(as->iommu_state, as->devfn, addr, perms);
            trace_amdvi_page_fault(addr);
            return;
        }
   [...]

I am aware I am jumping ahead a bit too much, but the point is that the whole event reporting likely needs a comprehensive review.

I need to study the area a lot more since even the simplest/only current use of amdvi_page_fault() right now is not very clear to me.

Alejandro

P.S.
Also confusing is this assertion in 2.5.3 IO_PAGE_FAULT Event:

"I/O page faults detected for translation requests return a translation-not-present response (R=W=0) to the device and are not logged in the event log."

so this suggests we should not emit a page fault i.e. don't log it in the event log.



Regards
Sairaj Kodilkar



Reply via email to