On Tue, Nov 14, 2017 at 10:52:54PM +0100, Auger Eric wrote: [...]
> I meant, in the current intel_iommu code, vtd_find_add_as() creates 1
> IOMMU MR and 1 AS per PCIe device, right?

I think this is the trickiest point: in QEMU, an IOMMU MR does not
really have a 1:1 relationship to a device. For Intel, it's true; for
Power, it's not. On Power guests, one device's DMA address space can be
split into different translation windows, and each window corresponds
to one IOMMU MR. So IMHO the real 1:1 mapping is between the device and
its DMA address space, rather than MRs.

It's been a long time since I drafted the patches. I think at least
that should be a more general notifier mechanism compared to the
current IOMMUNotifier, which is bound to IOTLB notifications only.
AFAICT if we want to trap first-level translation changes, the current
notifier is not even close to that interface: just see the definition
of IOMMUTLBEntry, which is tailored only for MAP/UNMAP of translation
addresses, not anything else. And IMHO that's why it's tightly bound to
MemoryRegions, and that's the root problem. The dynamic IOMMU MR
switching problem is related to this issue as well.

I am not sure the current "get IOMMU object from address space"
solution would be best; maybe its scope is too big. I think it depends
on whether in the future we'll have some requirement in such a bigger
scope (say, something we want to trap from the vIOMMU and deliver to
the host IOMMU, which may not even be device-related? I don't know).

Another alternative I am thinking about is whether we can provide a
per-device notifier. It could then be bound to PCIDevice rather than
MemoryRegions, so it would be in device scope.

Thanks,

--
Peter Xu
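
To make the per-device notifier idea a bit more concrete, here is a
rough standalone sketch. It is not QEMU code and not what the drafted
patches do: every name in it (SketchDevice, DevIOMMUEvent,
DevIOMMUNotifier, dev_iommu_notifier_register, dev_iommu_notify,
count_event) is made up for illustration. It only shows the shape of
the idea: the notifier list hangs off the device instead of a
MemoryRegion, and the event carries a type, so it is not limited to
MAP/UNMAP of translation addresses.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical event kinds (bit flags); none of this is QEMU API. */
typedef enum {
    DEV_IOMMU_EVT_MAP          = 1 << 0,
    DEV_IOMMU_EVT_UNMAP        = 1 << 1,
    DEV_IOMMU_EVT_FL_PGTBL_CHG = 1 << 2, /* first-level translation change */
} DevIOMMUEventType;

/* One event; which fields are meaningful depends on the type. */
typedef struct DevIOMMUEvent {
    DevIOMMUEventType type;
    uint64_t iova;        /* MAP/UNMAP only */
    uint64_t size;        /* MAP/UNMAP only */
    uint64_t pgtbl_base;  /* FL_PGTBL_CHG only */
} DevIOMMUEvent;

typedef struct DevIOMMUNotifier DevIOMMUNotifier;
typedef void (*DevIOMMUNotifyFn)(DevIOMMUNotifier *n,
                                 const DevIOMMUEvent *ev);

struct DevIOMMUNotifier {
    DevIOMMUNotifyFn notify;
    unsigned events;          /* mask of DevIOMMUEventType bits wanted */
    void *opaque;             /* consumer's private data */
    DevIOMMUNotifier *next;
};

/* Stand-in for a device: the notifier list lives here, in device scope. */
typedef struct SketchDevice {
    const char *name;
    DevIOMMUNotifier *notifiers;
} SketchDevice;

/* Attach a notifier to one device (pushed at the head of its list). */
void dev_iommu_notifier_register(SketchDevice *dev, DevIOMMUNotifier *n)
{
    assert(dev && n && n->notify);
    n->next = dev->notifiers;
    dev->notifiers = n;
}

/*
 * Deliver an event to every notifier on this device whose mask matches.
 * Returns the number of notifiers that were called.
 */
int dev_iommu_notify(SketchDevice *dev, const DevIOMMUEvent *ev)
{
    int delivered = 0;

    for (DevIOMMUNotifier *n = dev->notifiers; n; n = n->next) {
        if (n->events & (unsigned)ev->type) {
            n->notify(n, ev);
            delivered++;
        }
    }
    return delivered;
}

/* Example consumer: counts matching events in the int behind opaque. */
void count_event(DevIOMMUNotifier *n, const DevIOMMUEvent *ev)
{
    (void)ev;
    ++*(int *)n->opaque;
}
```

A consumer that only cares about first-level changes would register
with events = DEV_IOMMU_EVT_FL_PGTBL_CHG and would simply never see
MAP/UNMAP traffic for that device.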