On Thu, 2015-09-03 at 08:33 +1000, Benjamin Herrenschmidt wrote:
> On Wed, 2015-09-02 at 10:12 -0600, Alex Williamson wrote:
> >
> > There are very specific rules for translating requester IDs across
> > bridges. Bus numbers can change during enumeration, devfn cannot.

Thanks for clarifying that point, Alex. I realize I was a bit imprecise
in my last mail.

> > devfn can however be masked by topology changes from PCIe to PCI. If
> > we pretend that the IOMMU can distinguish requester IDs where it
> > can't on real hardware, we're going to break the guest. Thanks,
>
> Note that whether a PCI / PCI-X bridge will mask devfn, bus# or both or
> even mask it partially (number of bits) or replace some transfers with
> its own RID ... depends on a given bridge implementation.
>
> Another thing is while I agree that the bus number is problematic,
> since it changes, it is still what the HW actually uses to match the
> requester in practice, at least on PHB and I would think on Intel.
>
> The problem is more fundamental. qemu is trying to bind devices to
> address spaces in a fixed way at device creation time, while this is
> lazily resolved in HW at the point of the DMA occurring.

So let me try to sum up my understanding, in the context of the patch,
in terms of these two approaches:

> One way to fix it is to effectively have an address space per device,
> and have the iommu translate function figure out the binding
> dynamically and flush things if it detects a change. But that is tricky
> for vfio and it means invalidations will have to iterate all address
> spaces.

My patch is along these lines, in that it moves the address space
pointer into the device struct. The benefits are:

* The data structure for the DMA address space can be reused across
  IOMMUs, the address spaces can be set up before bus numbers are
  assigned, and the implementation is fairly simple.
* The IOMMU does not have to be notified of bus changes, except for
  invalidation purposes (but wouldn't a new enumeration cause a full
  IOMMU invalidate anyway?)

The drawbacks are:

* The IOMMUs get to know explicitly about devices behind a bridge,
  which logically deviates from how hardware works and complicates
  future attempts to implement bridges that translate RIDs.
* Each device can have only one DMA address space mapping associated
  with it (I suppose it might be possible to have a topology that would
  allow multiple paths to a device, but do we care at this stage?)

> The other option is to create Address Spaces on the fly as we lookup
> domains, and bind them to devices lazily, but again, we need to deal
> with changes/invalidations and that can be nasty with VFIO.

We could get there without changing the interfaces, by refining the
current implementation to just cache bus pointers at setup and then
lazily add address spaces for each device. This approach would yield
implementations specific to each IOMMU device, but would in practice
still associate devices with address spaces.

> Sadly, I can't think of a silver bullet.

I agree. The latter approach is better handled by restarting from the
current code, as my patch depends too much on the interface change.

Thanks,
Knut

> Cheers,
> Ben.
>
> > Alex
> >
> > > > I would suggest to address them so it will be easier to continue
> > > > the
> > > > review process.
> > >
> > > Knut
> > >
> > > > Thank you,
> > > > Marcel
> > > >
> > > > >
> > > > > This is the thread following the initial patch set:
> > > > >
> > > > > http://thread.gmane.org/gmane.comp.emulators.qemu/302246
> > > > >
> > > > > The patch set was also discussed in this thread:
> > > > >
> > > > > http://thread.gmane.org/gmane.comp.emulators.qemu/316949
> > > > >
> > > > > Changes from v1:
> > > > > - Rebased to current master
> > > > > - Fixed minor syntax issues
> > > > >
> > > > > Knut Omang (2):
> > > > >   iommu: Replace bus+devfn arguments with PCIDevice* in
> > > > > PCIIOMMUFunc
> > > > >   intel_iommu: Add support for translation for devices behind
> > > > > bridges.
> > > > >
> > > > >  hw/alpha/typhoon.c            |  2 +-
> > > > >  hw/i386/intel_iommu.c         | 56 +++++++++++++++++++------------------------------
> > > > >  hw/pci-host/apb.c             |  2 +-
> > > > >  hw/pci-host/prep.c            |  3 +--
> > > > >  hw/pci-host/q35.c             | 42 +++++++++++++-------------------
> > > > >  hw/pci/pci.c                  |  7 +++---
> > > > >  hw/pci/pci_bridge.c           |  6 +++++
> > > > >  hw/ppc/spapr_pci.c            |  2 +-
> > > > >  include/hw/i386/intel_iommu.h |  6 +++--
> > > > >  include/hw/pci/pci.h          |  5 +++-
> > > > >  10 files changed, 62 insertions(+), 69 deletions(-)
> > > > >
> > > > > --
> > > > > 2.4.3
> > > > >
> > > >
> > > >
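
To illustrate the lazy-binding alternative discussed in this mail, here
is a minimal sketch in C. It is not QEMU code: DemoDevice,
DemoAddressSpace, demo_rid(), demo_lookup_as() and demo_invalidate_all()
are hypothetical names invented for this example. The only thing it
assumes is the standard PCI requester ID layout (bus number in the upper
8 bits, devfn in the lower 8 bits). The address space is looked up by the
device's current requester ID at DMA time and created on first use, and a
bus renumbering is handled by dropping all cached bindings.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for a DMA address space; not QEMU's AddressSpace. */
typedef struct DemoAddressSpace {
    uint16_t rid;               /* requester ID this space was created for */
} DemoAddressSpace;

/* Hypothetical device: bus_num may change on re-enumeration, devfn may not. */
typedef struct DemoDevice {
    uint8_t bus_num;
    uint8_t devfn;
} DemoDevice;

#define DEMO_RID_COUNT 65536

/* One slot per possible requester ID, filled in lazily. */
static DemoAddressSpace *demo_as_table[DEMO_RID_COUNT];

/* Requester ID: bus number in the high byte, devfn in the low byte. */
static uint16_t demo_rid(const DemoDevice *dev)
{
    return (uint16_t)((dev->bus_num << 8) | dev->devfn);
}

/* Resolve the address space at "DMA time", creating it on first use. */
static DemoAddressSpace *demo_lookup_as(const DemoDevice *dev)
{
    uint16_t rid = demo_rid(dev);

    if (demo_as_table[rid] == NULL) {
        DemoAddressSpace *as = calloc(1, sizeof(*as));
        if (as == NULL) {
            abort();
        }
        as->rid = rid;
        demo_as_table[rid] = as;
    }
    assert(demo_as_table[rid]->rid == rid);
    return demo_as_table[rid];
}

/* Drop every cached binding, e.g. after the guest renumbers buses. */
static void demo_invalidate_all(void)
{
    for (size_t i = 0; i < DEMO_RID_COUNT; i++) {
        free(demo_as_table[i]);
        demo_as_table[i] = NULL;
    }
}
```

A caller would invoke demo_lookup_as() from the translate path and
demo_invalidate_all() whenever bus numbers are reassigned. The sketch
deliberately ignores the hard parts raised in the thread: bridges that
mask or replace requester IDs, and notifying users such as VFIO when a
binding is invalidated.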