On Thu, Jan 22, 2026 at 09:58:04PM +1100, Alexey Kardashevskiy wrote:

> > This issue with the RMP is no different, if you get a 2M IOPTE then
> > the HW should check the RMP and load in a 4K IOPTE to the IOTLB if
> > that is what the RMP requires.
> > That the HW doesn't do that means you have all these difficult
> > problems.
>
> Got it. Interestingly the HW actually does that, almost. Say, for
> >=2MB IO pages it checks if RMP==2MB and puts a 2MB IO TLB entry if
> RMP==2MB, and for 4KB..1MB IO pages - a 4K IO TLB entry and RMP==4K
> check. But it does not cross the 2MB boundary in RMP. Uff :-/

Not sure I understand this limitation, how does any aligned size cross
a 2MB boundary? Sounds like it was thought about, is it a HW bug that
some cases don't work?

> on the other hand, without swiotlb, dma_map() in the guest for
> untrusted device is likely to be lot less than 2MB and going to
> share another handful of pages but this activity is not that rare
> compared to my certificates example. If only there was a way to
> somehow bundle such allocations/mappings... :-/

ARM is pushing a thing where encrypt/decrypt has to work on certain
aligned granule sizes > PAGE_SIZE, you could use that mechanism to
select a 2M size for AMD too and avoid this.

> > That's a completely grotesque solution!
> >
> > It violates all of our software layers. The IOMMU and RMP are not
> > controled by the same software entity and you propose to have a FW
> > call that edits *both* together somehow? How is that even going to
> > work safely?
> >
> > Can't you do things in a sequence?
> >
> > Change the iommu from 2M to 4K, flush, then change the RMP from 2M to
> > 4K?
>
> Sure we could unless there is ongoing DMA between "flush" and "then
> change" and then DMA will fail because of mismatching page sizes
> (that 2MB crossing thing above).

I'm confused, if the IOMMU has 4K and the RMP has 2M it doesn't work?
Then why was I told the 4k page size kernel parameter fixes
everything? What happens if the guest puts 4K pages into its AMDv2
table and RMP is 2M?

> > > If I get it right, for other platforms, the entire IOMMU table is
> > > going to live in a secure space so there will be similar FW calls so
> > > it is not that different.
> >
> > At least ARM the iommu S2 table is in secure memory and the secure FW
> > keeps it 1:1 with the KVM S2 table. So edits to the KVM automatically
> > make matching edits to the IOMMU. Only one software layer is
> > responsible for things.
>
> Does KVM talk to the host IOMMU code for that (and then the IOMMU code calls
> the secure world)?
> Or KVM goes straight to that secure world?

Straight to the secure world, there is no host IOMMU driver for the
secure IOMMU.

> Is the host IOMMU code aware of the content of the secure IOMMU table?

No, it isn't even aware it exists.

> Does 2MB->4K smashing exist on ARM at all?

Every arch has cases where larger mappings need to be reduced to
smaller ones, but ARM doesn't require synchronized coordination
between multiple tables.

Jason
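
For reference, the alignment question above ("how does any aligned size
cross a 2MB boundary?") comes down to plain address arithmetic. The
sketch below is only that: it assumes power-of-two, naturally aligned
mappings, and the helper names are hypothetical. It does not model the
actual AMD IOMMU or RMP behaviour, which this thread leaves open.

```python
SZ_4K = 4 << 10
SZ_2M = 2 << 20


def is_naturally_aligned(addr, size):
    """True if size is a power of two and addr is a multiple of size."""
    return size > 0 and size & (size - 1) == 0 and addr % size == 0


def crosses_2m_boundary(addr, size):
    """True if [addr, addr + size) touches more than one 2MB-aligned region."""
    return addr // SZ_2M != (addr + size - 1) // SZ_2M


def any_aligned_mapping_crosses(limit=4 * SZ_2M):
    """Scan every naturally aligned mapping from 4K up to 2M below limit.

    Returns True if any of them straddles a 2MB boundary.
    """
    size = SZ_4K
    while size <= SZ_2M:
        for addr in range(0, limit, size):
            # Every addr generated here is a multiple of size by construction.
            assert is_naturally_aligned(addr, size)
            if crosses_2m_boundary(addr, size):
                return True
        size *= 2
    return False
```

Under these assumptions `any_aligned_mapping_crosses()` finds no
crossing for sizes from 4K to 2M. A range only straddles a 2MB boundary
when it is misaligned (for example 8K starting at 2M - 4K) or larger
than 2M (for example a 4M mapping at address 0).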
