Il 09/08/2013 13:21, Alexey Kardashevskiy ha scritto: > On 08/09/2013 09:07 PM, Paolo Bonzini wrote: >> Il 09/08/2013 12:58, Alexey Kardashevskiy ha scritto: >>> On 08/09/2013 08:19 PM, Paolo Bonzini wrote: >>>> Il 09/08/2013 12:13, Alexey Kardashevskiy ha scritto: >>>>> On 08/09/2013 07:53 PM, Paolo Bonzini wrote: >>>>>> Il 09/08/2013 11:48, Alexey Kardashevskiy ha scritto: >>>>>>> On 08/09/2013 07:40 PM, Paolo Bonzini wrote: >>>>>>>> Il 09/08/2013 10:49, Alexey Kardashevskiy ha scritto: >>>>>>>>> A PCI device's DMA address space (possibly an IOMMU) is returned by a >>>>>>>>> method on the PCIBus. At the moment that only has one caller, so the >>>>>>>>> method is simply open coded. We'll need another caller for VFIO, so >>>>>>>>> this patch introduces a helper/wrapper function. >>>>>>>>> >>>>>>>>> Signed-off-by: David Gibson <da...@gibson.dropbear.id.au> >>>>>>>>> [aik: added inheritance from parent if iommu is not set for the >>>>>>>>> current bus] >>>>>>>>> Signed-off-by: Alexey Kardashevskiy <a...@ozlabs.ru> >>>>>>>>> >>>>>>>>> --- >>>>>>>>> Changes: >>>>>>>>> v2: >>>>>>>>> * added inheritance, needed for a pci-bridge on spapr-ppc64 >>>>>>>>> * pci_iommu_as renamed to pci_device_iommu_address_space >>>>>>>>> --- >>>>>>>>> hw/pci/pci.c | 22 ++++++++++++++++------ >>>>>>>>> include/hw/pci/pci.h | 1 + >>>>>>>>> 2 files changed, 17 insertions(+), 6 deletions(-) >>>>>>>>> >>>>>>>>> diff --git a/hw/pci/pci.c b/hw/pci/pci.c >>>>>>>>> index 4c004f5..0072b54 100644 >>>>>>>>> --- a/hw/pci/pci.c >>>>>>>>> +++ b/hw/pci/pci.c >>>>>>>>> @@ -812,12 +812,7 @@ static PCIDevice >>>>>>>>> *do_pci_register_device(PCIDevice *pci_dev, PCIBus *bus, >>>>>>>>> } >>>>>>>>> >>>>>>>>> pci_dev->bus = bus; >>>>>>>>> - if (bus->iommu_fn) { >>>>>>>>> - dma_as = bus->iommu_fn(bus, bus->iommu_opaque, devfn); >>>>>>>>> - } else { >>>>>>>>> - /* FIXME: inherit memory region from bus creator */ >>>>>>>>> - dma_as = &address_space_memory; >>>>>>>>> - } >>>>>>>>> + dma_as = pci_device_iommu_address_space(pci_dev); >>>>>>>>> >>>>>>>>> memory_region_init_alias(&pci_dev->bus_master_enable_region, >>>>>>>>> OBJECT(pci_dev), "bus master", >>>>>>>>> @@ -2239,6 +2234,21 @@ static void pci_device_class_init(ObjectClass >>>>>>>>> *klass, void *data) >>>>>>>>> k->props = pci_props; >>>>>>>>> } >>>>>>>>> >>>>>>>>> +AddressSpace *pci_device_iommu_address_space(PCIDevice *dev) >>>>>>>>> +{ >>>>>>>>> + PCIBus *bus = PCI_BUS(dev->bus); >>>>>>>>> + >>>>>>>>> + if (bus->iommu_fn) { >>>>>>>>> + return bus->iommu_fn(bus, bus->iommu_opaque, dev->devfn); >>>>>>>>> + } >>>>>>>>> + >>>>>>>>> + if (bus->parent_dev) { >>>>>>>>> + return pci_device_iommu_address_space(bus->parent_dev); >>>>>>>>> + } >>>>>>>> >>>>>>>> No, this would fail if bus->parent_dev is not NULL but not a PCI device >>>>>>>> either. >>>>>>> >>>>>>> parent_dev is of the PCIDevice* type, how can it be not a PCI device? >>>>>>> :-/ >>>>>> >>>>>> Doh, I misread the code, I thought it was the "parent" field in >>>>>> BusState. Why do we have parent_dev at all? >>>>> >>>>> The code is too old? Don't know. >>>>> >>>>> >>>>>>>> You can use object_dynamic_cast to convert the parent_dev to >>>>>>>> PCIDevice, and if the cast succeeds you call the new function. >>>>>>>> >>>>>>>> Perhaps you could make the new function take a PCIBus instead. >>>>>>>> Accessing the PCIDevice's IOMMU address space (as opposed to the >>>>>>>> bus-master address space) doesn't make much sense, VFIO is really a >>>>>>>> special case. Putting the new API on the bus side instead looks >>>>>>>> better. >>>>>>>> >>>>>>>> (BTW, do you need to enable bus-master DMA on PCI bridges, to do DMA >>>>>>>> for >>>>>>>> devices sitting on the secondary bus?) >>>>>>> >>>>>>> It happens naturally I guess when linux enables devices. >>>>>> >>>>>> Yes, but then using the IOMMU address space would be wrong; you would >>>>>> have to use the bus-master address space as a base for the child's >>>>>> bus-master address space. >>>>> >>>>> >>>>> Like this? Works too. >>>>> >>>>> diff --git a/hw/pci/pci.c b/hw/pci/pci.c >>>>> index 8bdcedc..a4c70e6 100644 >>>>> --- a/hw/pci/pci.c >>>>> +++ b/hw/pci/pci.c >>>>> @@ -2247,23 +2247,23 @@ AddressSpace >>>>> *pci_device_iommu_address_space(PCIDevice *dev) >>>>> { >>>>> PCIBus *bus = PCI_BUS(dev->bus); >>>>> >>>>> if (bus->iommu_fn) { >>>>> return bus->iommu_fn(bus, bus->iommu_opaque, dev->devfn); >>>>> } >>>>> >>>>> if (bus->parent_dev) { >>>>> return pci_device_iommu_address_space(bus->parent_dev); >>>>> } >>>>> >>>>> - return &address_space_memory; >>>>> + return &dev->bus_master_as; >>>>> } >>>> >>>> I was thinking more like this: >>>> >>>> if (bus->parent_dev) { >>>> - return pci_device_iommu_address_space(bus->parent_dev); >>>> + /* Take parent device's bus-master enable bit into account. */ >>>> + return pci_get_address_space(bus->parent_dev); >>>> } >>>> >>>> + /* Not a secondary bus and no IOMMU. Use system memory. */ >>>> return &address_space_memory; >>> >>> Ok. >>> >>> Oh. BTW. This "bus master" thing now breaks VFIO's check for all devices >>> being in the same address space as every single device has its own "bus >>> master" AddressSpace. >> >> Yes, fixing that check is another good use of the new API you introduced. >> >> But after Ben's answer, I guess the above change is not really needed. >> It would add complication for VFIO, too. Proper emulation would require >> QEMU to trap writes to the device's bus-master bit. QEMU would have to >> take of the value that the guest writes, AND it with the bus-master >> bit(s) of all bridges between the host bridge and the VFIO device, and >> write the result to the passed-through device. This is because the >> bridges are emulated, and do not exist in real hardware. > > So does this mean that we go with the original patch and ignore bus master > address space here?
Yes. Just add a comment that we are ignoring the bus master DMA bit of the bridge. > I guess I could overcome that VFIO check by comparing not just AddressSpace > but AddressSpace->root if AddressSpace is different but it does not make a > lot of sense. Paolo