Hi,

As far as I tested, the disabled code (the call to memory_region_iommu_replay) hangs QEMU on startup if the IOMMU is enabled (scanning a 64-bit address space takes more than an hour on modern hardware), at least on x86. So the code is not 100% correct in every context. Maybe it should just be disabled for the x86 architecture?
By specification, any such application of a domain to a device should include a cache invalidation if the CM flag is present, so I don't think my patch breaks this scenario.

Thanks,
Aviv.

On Sat, May 28, 2016 at 8:39 PM Alex Williamson <alex.william...@redhat.com> wrote:
> On Sat, 28 May 2016 16:10:55 +0000
> "Aviv B.D." <bd.a...@gmail.com> wrote:
>
> > On Sat, May 28, 2016 at 7:02 PM Alex Williamson <alex.william...@redhat.com> wrote:
> >
> > > On Sat, 28 May 2016 10:52:58 +0000
> > > "Aviv B.D." <bd.a...@gmail.com> wrote:
> > >
> > > > Hi,
> > > > Your idea to search for the relevant VTDAddressSpace and call its
> > > > notifier will probably work. Next week I'll try to implement it (for
> > > > now with the costly scan of each context).
> > >
> > > I think an optimization we can make is to use pci_for_each_bus() and
> > > pci_for_each_device() to scan only context entries where devices are
> > > present. Then for each context entry, retrieve the DID; if it matches
> > > the invalidation domain_id, retrieve the VTDAddressSpace and perform a
> > > memory_region_notify_iommu() using VTDAddressSpace.iommu. Still
> > > horribly inefficient, but an improvement over walking all context
> > > entries, and it avoids gratuitous callbacks between unrelated drivers
> > > in QEMU.
> >
> > Thanks for the references on how I can do it. :)
> >
> > > Overall, I have very little faith that this will be the only change
> > > required to make this work, though. For instance, if a device is added
> > > or removed from a domain, where is that accounted for? Ideally this
> > > should trigger the region_add/region_del listener callbacks, but I
> > > don't see how that works with how VT-d creates a fixed VTDAddressSpace
> > > per device, and in fact how our QEMU memory model doesn't allow the
> > > address space of a device to be dynamically aliased against other
> > > address spaces, or really changed at all.
> > > > I'm still not sure if populating the MemoryRegion will suffice for
> > > > hot-plugged vfio devices, but I'll try to look into it.
> > > >
> > > > As far as I understand the memory_region_iommu_replay function, it
> > > > still scans the whole 64-bit address space, and therefore may hang
> > > > the VM for a long time.
> > >
> > > Then we need to fix that problem. One option might be to add a replay
> > > callback on MemoryRegionIOMMUOps that walks the page tables for a
> > > given context entry rather than blindly traversing a 64-bit address
> > > space. We can't simply ignore the issue by #ifdef'ing out the code. I
> > > suspect there's a lot more involved in making VT-d interact properly
> > > with a physical device than what's been proposed so far. At every
> > > invalidation, we need to figure out what's changed and update the host
> > > mappings. We also need better, more dynamic address space management
> > > to make the virtual hardware reflect physical hardware when we enable
> > > things like passthrough mode or have multiple devices sharing an iommu
> > > domain. I think we're just barely scratching the surface here. Thanks,
> > >
> > > Alex
> >
> > I agree with you regarding hotplug; that's why I only #ifdef'ed this
> > code out and didn't delete it. With the call to
> > memory_region_iommu_replay, QEMU hangs on startup in a very long loop
> > that prevents any device assignment with vIOMMU enabled.
> >
> > I'm hoping not to enlarge the scope of this patch to include hotplug
> > device assignment with the iommu enabled.
>
> It's not just hotplug; any case where an existing domain can be applied
> to a device. The series is incomplete without such support, and I won't
> accept any changes into vfio that disable code that's correct in other
> contexts. Thanks,
>
> Alex