On 6/13/25 4:46 AM, Sairaj Kodilkar wrote:


On 5/31/2025 3:00 AM, Alejandro Jimenez wrote:
Hey Sairaj,

On 5/29/25 2:16 AM, Sairaj Kodilkar wrote:


On 5/2/2025 7:46 AM, Alejandro Jimenez wrote:
On reset, restore the default address translation mode for all the
address spaces managed by the vIOMMU.

Signed-off-by: Alejandro Jimenez <alejandro.j.jime...@oracle.com>
---
  hw/i386/amd_iommu.c | 28 ++++++++++++++++++++++++++++
  1 file changed, 28 insertions(+)

diff --git a/hw/i386/amd_iommu.c b/hw/i386/amd_iommu.c
index 71018d70dd10..90491367594b 100644
--- a/hw/i386/amd_iommu.c
+++ b/hw/i386/amd_iommu.c
@@ -962,6 +962,33 @@ static void amdvi_switch_address_space(AMDVIAddressSpace *amdvi_as)
      }
  }
+/*
+ * For all existing address spaces managed by the IOMMU, enable/ disable the + * corresponding memory regions depending on the address translation mode
+ * as determined by the global and individual address space settings.
+ */
+static void amdvi_switch_address_space_all(AMDVIState *s)
+{
+    AMDVIAddressSpace **iommu_as;
+
+    for (int bus_num = 0; bus_num < PCI_BUS_MAX; bus_num++) {
+
+        /* Nothing to do if there are no devices on the current bus */
+        if (!s->address_spaces[bus_num]) {
+            continue;
+        }
+        iommu_as = s->address_spaces[bus_num];
+
+        for (int devfn = 0; devfn < PCI_DEVFN_MAX; devfn++) {
+
+            if (!iommu_as[devfn]) {
+                continue;
+            }
+            amdvi_switch_address_space(iommu_as[devfn]);
+        }
+    }
+}
+
  /* log error without aborting since linux seems to be using reserved bits */
  static void amdvi_inval_devtab_entry(AMDVIState *s, uint64_t *cmd)
  {
@@ -2199,6 +2226,7 @@ static void amdvi_sysbus_reset(DeviceState *dev)
      /* Discard all mappings on device reset */
      amdvi_address_space_unmap_all(s);
+    amdvi_switch_address_space_all(s);

Hi Alejandro

I think amdvi_sysbus_reset should set iommu_as->addr_translation flag to
"false" before switching all the address spaces. Without this, the
devices will keep using IOMMU address space.

My first impulse is to agree with you, from the standpoint of considering the no_dma mode as the "default mode", and a reset should bring us back to default. But I wonder that is necessarily the architectural behavior...

After a reset, in order for any device transactions to be processed, a guest driver must reinitialize the IOMMU data structures, including the Device Table and specifically the table entry for the device. That must trigger a INVAL_DEVTAB_ENTRY that will be intercepted and as-  >addr_translation will be set correctly. If the guest driver doesn't do these operations, then a device won't be able to use the IOMMU because it doesn't have a valid DTE, right? The earlier mappings were already dropped, so it doesn't affect the host.

Again, I see your point, and making this change is likely the right thing to do, which is why I'll make the change for v3. Just wondering if implementing such behavior is actually architecturally accurate or just the "common sense" approach...

Thank you for your attention to detail and all the valuable feedback. I will be out next week, and will send v3 once I am back online.

Alejandro


Hey

I tested current patches on OVMF and reboot crashes with IO_PAGE_FAULT
logs on _host_. But setting "addr_translation=false" fixes this crash.


That is odd, I have been using OVMF since the beginning with no issues, so I'd have to ask you more questions about your setup to figure out what the difference is since I cannot reproduce this specific issue.

I think the reason is that, kernel does not reset DTE while shutting
down the system. This keeps the stale mappings (IOVA->SPA) still in the
IOMMU and OVMF initiates some DMA operations on device before guest
reboots and sets up the new mappings.


On reset, the call to

amdvi_sysbus_reset()
        amdvi_address_space_unmap_all()
                amdvi_address_space_unmap()

is intended to avoid this problem.
I have found that on reboot AND when using forcedac=1, there are mappings for the upper end of the IOVA space that are not dropped (this happens after fixing the integer overflow issue you pointed out). I am looking into it now, and once it is fixed I'd appreciate it if you could try to reproduce this scenario again, as the underlying cause might be the same.

Alejandro

Thanks
Sairaj

Regards
Sairaj Kodilkar
  }
  static void amdvi_sysbus_realize(DeviceState *dev, Error **errp)





Reply via email to