On 08/09/2013 09:07 PM, Paolo Bonzini wrote: > Il 09/08/2013 12:58, Alexey Kardashevskiy ha scritto: >> On 08/09/2013 08:19 PM, Paolo Bonzini wrote: >>> Il 09/08/2013 12:13, Alexey Kardashevskiy ha scritto: >>>> On 08/09/2013 07:53 PM, Paolo Bonzini wrote: >>>>> Il 09/08/2013 11:48, Alexey Kardashevskiy ha scritto: >>>>>> On 08/09/2013 07:40 PM, Paolo Bonzini wrote: >>>>>>> Il 09/08/2013 10:49, Alexey Kardashevskiy ha scritto: >>>>>>>> A PCI device's DMA address space (possibly an IOMMU) is returned by a >>>>>>>> method on the PCIBus. At the moment that only has one caller, so the >>>>>>>> method is simply open coded. We'll need another caller for VFIO, so >>>>>>>> this patch introduces a helper/wrapper function. >>>>>>>> >>>>>>>> Signed-off-by: David Gibson <da...@gibson.dropbear.id.au> >>>>>>>> [aik: added inheritance from parent if iommu is not set for the >>>>>>>> current bus] >>>>>>>> Signed-off-by: Alexey Kardashevskiy <a...@ozlabs.ru> >>>>>>>> >>>>>>>> --- >>>>>>>> Changes: >>>>>>>> v2: >>>>>>>> * added inheritance, needed for a pci-bridge on spapr-ppc64 >>>>>>>> * pci_iommu_as renamed to pci_device_iommu_address_space >>>>>>>> --- >>>>>>>> hw/pci/pci.c | 22 ++++++++++++++++------ >>>>>>>> include/hw/pci/pci.h | 1 + >>>>>>>> 2 files changed, 17 insertions(+), 6 deletions(-) >>>>>>>> >>>>>>>> diff --git a/hw/pci/pci.c b/hw/pci/pci.c >>>>>>>> index 4c004f5..0072b54 100644 >>>>>>>> --- a/hw/pci/pci.c >>>>>>>> +++ b/hw/pci/pci.c >>>>>>>> @@ -812,12 +812,7 @@ static PCIDevice >>>>>>>> *do_pci_register_device(PCIDevice *pci_dev, PCIBus *bus, >>>>>>>> } >>>>>>>> >>>>>>>> pci_dev->bus = bus; >>>>>>>> - if (bus->iommu_fn) { >>>>>>>> - dma_as = bus->iommu_fn(bus, bus->iommu_opaque, devfn); >>>>>>>> - } else { >>>>>>>> - /* FIXME: inherit memory region from bus creator */ >>>>>>>> - dma_as = &address_space_memory; >>>>>>>> - } >>>>>>>> + dma_as = pci_device_iommu_address_space(pci_dev); >>>>>>>> >>>>>>>> memory_region_init_alias(&pci_dev->bus_master_enable_region, >>>>>>>> OBJECT(pci_dev), "bus master", >>>>>>>> @@ -2239,6 +2234,21 @@ static void pci_device_class_init(ObjectClass >>>>>>>> *klass, void *data) >>>>>>>> k->props = pci_props; >>>>>>>> } >>>>>>>> >>>>>>>> +AddressSpace *pci_device_iommu_address_space(PCIDevice *dev) >>>>>>>> +{ >>>>>>>> + PCIBus *bus = PCI_BUS(dev->bus); >>>>>>>> + >>>>>>>> + if (bus->iommu_fn) { >>>>>>>> + return bus->iommu_fn(bus, bus->iommu_opaque, dev->devfn); >>>>>>>> + } >>>>>>>> + >>>>>>>> + if (bus->parent_dev) { >>>>>>>> + return pci_device_iommu_address_space(bus->parent_dev); >>>>>>>> + } >>>>>>> >>>>>>> No, this would fail if bus->parent_dev is not NULL but not a PCI device >>>>>>> either. >>>>>> >>>>>> parent_dev is of the PCIDevice* type, how can it be not a PCI device? :-/ >>>>> >>>>> Doh, I misread the code, I thought it was the "parent" field in >>>>> BusState. Why do we have parent_dev at all? >>>> >>>> The code is too old? Don't know. >>>> >>>> >>>>>>> You can use object_dynamic_cast to convert the parent_dev to >>>>>>> PCIDevice, and if the cast succeeds you call the new function. >>>>>>> >>>>>>> Perhaps you could make the new function take a PCIBus instead. >>>>>>> Accessing the PCIDevice's IOMMU address space (as opposed to the >>>>>>> bus-master address space) doesn't make much sense, VFIO is really a >>>>>>> special case. Putting the new API on the bus side instead looks better. >>>>>>> >>>>>>> (BTW, do you need to enable bus-master DMA on PCI bridges, to do DMA for >>>>>>> devices sitting on the secondary bus?) >>>>>> >>>>>> It happens naturally I guess when linux enables devices. >>>>> >>>>> Yes, but then using the IOMMU address space would be wrong; you would >>>>> have to use the bus-master address space as a base for the child's >>>>> bus-master address space. >>>> >>>> >>>> Like this? Works too. >>>> >>>> diff --git a/hw/pci/pci.c b/hw/pci/pci.c >>>> index 8bdcedc..a4c70e6 100644 >>>> --- a/hw/pci/pci.c >>>> +++ b/hw/pci/pci.c >>>> @@ -2247,23 +2247,23 @@ AddressSpace >>>> *pci_device_iommu_address_space(PCIDevice *dev) >>>> { >>>> PCIBus *bus = PCI_BUS(dev->bus); >>>> >>>> if (bus->iommu_fn) { >>>> return bus->iommu_fn(bus, bus->iommu_opaque, dev->devfn); >>>> } >>>> >>>> if (bus->parent_dev) { >>>> return pci_device_iommu_address_space(bus->parent_dev); >>>> } >>>> >>>> - return &address_space_memory; >>>> + return &dev->bus_master_as; >>>> } >>> >>> I was thinking more like this: >>> >>> if (bus->parent_dev) { >>> - return pci_device_iommu_address_space(bus->parent_dev); >>> + /* Take parent device's bus-master enable bit into account. */ >>> + return pci_get_address_space(bus->parent_dev); >>> } >>> >>> + /* Not a secondary bus and no IOMMU. Use system memory. */ >>> return &address_space_memory; >> >> Ok. >> >> Oh. BTW. This "bus master" thing now breaks VFIO's check for all devices >> being in the same address space as every single device has its own "bus >> master" AddressSpace. > > Yes, fixing that check is another good use of the new API you introduced. > > But after Ben's answer, I guess the above change is not really needed. > It would add complication for VFIO, too. Proper emulation would require > QEMU to trap writes to the device's bus-master bit. QEMU would have to > take of the value that the guest writes, AND it with the bus-master > bit(s) of all bridges between the host bridge and the VFIO device, and > write the result to the passed-through device. This is because the > bridges are emulated, and do not exist in real hardware.
So does this mean that we go with the original patch and ignore bus master address space here? I guess I could overcome that VFIO check by comparing not just AddressSpace but AddressSpace->root if AddressSpace is different but it does not make a lot of sense. > >>>>> Also, we would have to fix the x86 firmware. >> >> btw what would it mean? :) > > Firmware would have to look for bridges and enable bus master DMA on > them. Otherwise, for example a SCSI controller below a bridge (not > virtio-scsi) would fail to boot. > > Paolo > -- Alexey