Hi Sairaj Kodilkar,
On 3/20/25 1:11 AM, Arun Kodilkar, Sairaj wrote:
On 3/11/2025 8:54 PM, Alejandro Jimenez wrote:
The AMD I/O Virtualization Technology (IOMMU) Specification (see Table
8: V,
TV, and GV Fields in Device Table Entry), specifies that a DTE with V=0,
TV=1 does not contain a valid address translation information. If a
request
requires a table walk, the walk is terminated when this condition is
encountered.
Do not assume that addresses for a device with DTE[TV]=0 are passed
through
(i.e. not remapped) and instead terminate the page table walk early.
Cc: qemu-sta...@nongnu.org
Fixes: d29a09ca6842 ("hw/i386: Introduce AMD IOMMU")
Signed-off-by: Alejandro Jimenez <alejandro.j.jime...@oracle.com>
---
hw/i386/amd_iommu.c | 88 +++++++++++++++++++++++++--------------------
1 file changed, 49 insertions(+), 39 deletions(-)
diff --git a/hw/i386/amd_iommu.c b/hw/i386/amd_iommu.c
index cf00450ebe..31d5522a62 100644
--- a/hw/i386/amd_iommu.c
+++ b/hw/i386/amd_iommu.c
@@ -932,51 +932,61 @@ static void amdvi_page_walk(AMDVIAddressSpace
*as, uint64_t *dte,
uint64_t pte = dte[0], pte_addr, page_mask;
/* make sure the DTE has TV = 1 */
- if (pte & AMDVI_DEV_TRANSLATION_VALID) {
- level = get_pte_translation_mode(pte);
- if (level >= 7) {
- trace_amdvi_mode_invalid(level, addr);
+ if (!(pte & AMDVI_DEV_TRANSLATION_VALID)) {
+ /*
+ * A DTE with V=1, TV=0 does not have a valid Page Table Root
Pointer.
+ * An IOMMU processing a request that requires a table walk
terminates
+ * the walk when it encounters this condition. Do the same
and return
+ * instead of assuming that the address is forwarded without
translation
+ * i.e. the passthrough case, as it is done for the case
where DTE[V]=0.
+ */
Hi Alejandro,
According to AMD IOMMU specs TABLE 44 (IO_PAGE_FAULT Event Types), IOMMU
reports IO_PAGE_FAULT event when TV bit is not set in the DTE.
Hence you should use amdvi_page_fault to report the IO_PAGE_FAULT
event before returning.
Good point, I had not considered this (and really haven't paid attention
to the page fault reporting until now). On first impression, I tend to
agree with your observation and will include in on v2. That being said...
The current role of amdvi_page_walk() is to be a helper for the
translate() method and ultimately the IOMMU replay() functionality.
These are needed for emulation but do not necessarily correspond 1:1
with guest-initiated operations. e.g. VFIO code will call
memory_region_iommu_replay() (which ends up calling amdvi_walk_page())
when syncing the dirty bitmap or after registering a new notifier. The
guest would get IO_PAGE_FAULT events for all the regions where a mapping
doesn't currently exist, which doesn't seem correct.
IOW, I think even this existing call to amdvi_page_fault() is not
necessary/correct:
do {
pte_perms = amdvi_get_perms(pte);
present = pte & 1;
if (!present || perms != (perms & pte_perms)) {
amdvi_page_fault(as->iommu_state, as->devfn, addr, perms);
trace_amdvi_page_fault(addr);
return;
}
[...]
I am aware I am jumping ahead a bit too much, but the point is that the
whole event reporting likely needs a comprehensive review.
I need to study the area a lot more since even the simplest/only current
use of amdvi_page_fault() right now is not very clear to me.
Alejandro
P.S.
Also confusing is this assertion in 2.5.3 IO_PAGE_FAULT Event:
"I/O page faults detected for translation requests return a
translation-not-present response (R=W=0) to the device and are not
logged in the event log."
so this suggests we should not emit a page fault i.e. don't log it in
the event log.
Regards
Sairaj Kodilkar