* Jag Raman (jag.ra...@oracle.com) wrote:
> 
> 
> > On Jan 19, 2022, at 7:12 PM, Michael S. Tsirkin <m...@redhat.com> wrote:
> > 
> > On Wed, Jan 19, 2022 at 04:41:52PM -0500, Jagannathan Raman wrote:
> >> Allow PCI buses to be part of isolated CPU address spaces. This has a
> >> niche usage.
> >> 
> >> TYPE_REMOTE_MACHINE allows multiple VMs to house their PCI devices in
> >> the same machine/server. This would cause address space collision as
> >> well as be a security vulnerability. Having separate address spaces for
> >> each PCI bus would solve this problem.
> > 
> > Fascinating, but I am not sure I understand. any examples?
> 
> Hi Michael!
> 
> multiprocess QEMU and vfio-user implement a client-server model to allow
> out-of-process emulation of devices. The client QEMU, which makes ioctls
> to the kernel and runs VCPUs, could attach devices running in a server
> QEMU. The server QEMU needs access to parts of the client’s RAM to
> perform DMA.
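
(For illustration only, a minimal sketch of the server-side DMA path being
described above; it is not part of the series. The names MyDevState,
mydev_fetch_descriptor and desc_addr are made up. The point is that
pci_dma_read() resolves through the device's bus-master address space, which
with this series would be built from the bus's isolated address space when
one is set and no IOMMU is in the way, so the access lands in the attaching
client's RAM rather than in the global address_space_memory.)

    #include "qemu/osdep.h"
    #include "hw/pci/pci.h"

    /* Hypothetical device state used only for this sketch. */
    typedef struct MyDevState {
        PCIDevice parent_obj;
    } MyDevState;

    /*
     * Read a 16-byte descriptor from guest (client) RAM.  pci_dma_read()
     * goes through the device's bus-master address space.
     */
    static void mydev_fetch_descriptor(MyDevState *s, dma_addr_t desc_addr)
    {
        uint8_t desc[16];

        pci_dma_read(&s->parent_obj, desc_addr, desc, sizeof(desc));
        /* ... parse the descriptor ... */
    }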
Do you ever have the opposite problem? i.e. when an emulated PCI device
exposes a chunk of RAM-like space (frame buffer, or maybe a mapped file)
that the client can see. What happens if two emulated devices need to
access each other's emulated address space?

Dave

> In the case where multiple clients attach devices that are running on the
> same server, we need to ensure that each device has isolated memory
> ranges. This ensures that the memory space of one device is not visible
> to other devices in the same server.
> 
> > 
> > I also wonder whether this special type could be modelled like a special
> > kind of iommu internally.
> 
> Could you please provide some more details on the design?
> 
> > 
> >> Signed-off-by: Elena Ufimtseva <elena.ufimts...@oracle.com>
> >> Signed-off-by: John G Johnson <john.g.john...@oracle.com>
> >> Signed-off-by: Jagannathan Raman <jag.ra...@oracle.com>
> >> ---
> >>  include/hw/pci/pci.h     |  2 ++
> >>  include/hw/pci/pci_bus.h | 17 +++++++++++++++++
> >>  hw/pci/pci.c             | 17 +++++++++++++++++
> >>  hw/pci/pci_bridge.c      |  5 +++++
> >>  4 files changed, 41 insertions(+)
> >> 
> >> diff --git a/include/hw/pci/pci.h b/include/hw/pci/pci.h
> >> index 023abc0f79..9bb4472abc 100644
> >> --- a/include/hw/pci/pci.h
> >> +++ b/include/hw/pci/pci.h
> >> @@ -387,6 +387,8 @@ void pci_device_save(PCIDevice *s, QEMUFile *f);
> >>  int pci_device_load(PCIDevice *s, QEMUFile *f);
> >>  MemoryRegion *pci_address_space(PCIDevice *dev);
> >>  MemoryRegion *pci_address_space_io(PCIDevice *dev);
> >> +AddressSpace *pci_isol_as_mem(PCIDevice *dev);
> >> +AddressSpace *pci_isol_as_io(PCIDevice *dev);
> >> 
> >>  /*
> >>   * Should not normally be used by devices. For use by sPAPR target
> >> diff --git a/include/hw/pci/pci_bus.h b/include/hw/pci/pci_bus.h
> >> index 347440d42c..d78258e79e 100644
> >> --- a/include/hw/pci/pci_bus.h
> >> +++ b/include/hw/pci/pci_bus.h
> >> @@ -39,9 +39,26 @@ struct PCIBus {
> >>      void *irq_opaque;
> >>      PCIDevice *devices[PCI_SLOT_MAX * PCI_FUNC_MAX];
> >>      PCIDevice *parent_dev;
> >> +
> >>      MemoryRegion *address_space_mem;
> >>      MemoryRegion *address_space_io;
> >> 
> >> +    /**
> >> +     * Isolated address spaces - these allow the PCI bus to be part
> >> +     * of an isolated address space as opposed to the global
> >> +     * address_space_memory & address_space_io.
> > 
> > Are you sure address_space_memory & address_space_io are
> > always global? even in the case of an iommu?
> 
> On the CPU side of the Root Complex, I believe address_space_memory
> & address_space_io are global.
> 
> In the vfio-user case, devices on the same machine (TYPE_REMOTE_MACHINE)
> could be attached to different client VMs. Each client would have its own
> address space for its CPUs. With isolated address spaces, we ensure that the
> devices see the address space of the CPUs they’re attached to.
> 
> Not sure if it’s OK to share weblinks in this mailing list; please let me
> know if that’s not preferred. But I’m referring to the terminology used in
> the following block diagram:
> https://en.wikipedia.org/wiki/Root_complex#/media/File:Example_PCI_Express_Topology.svg
> 
> > 
> >> This allows the
> >> +     * bus to be attached to CPUs from different machines. The
> >> +     * following is not used used commonly.
> >> +     *
> >> +     * TYPE_REMOTE_MACHINE allows emulating devices from multiple
> >> +     * VM clients,
> > 
> > what are VM clients?
> 
> It’s the client in the client-server model explained above.
> 
> Thank you!
> --
> Jag
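
(Again only a sketch, not part of this patch: one plausible way a
TYPE_REMOTE_MACHINE-style machine could give each client's root bus its own
isolated address space, as described in the reply above. The names
client_sysmem, client_as and remote_machine_isolate_bus are invented for the
example.)

    #include "qemu/osdep.h"
    #include "exec/memory.h"
    #include "exec/address-spaces.h"
    #include "hw/pci/pci_bus.h"

    /*
     * Per-client memory container and address space, distinct from the
     * global get_system_memory()/address_space_memory.
     */
    static MemoryRegion client_sysmem;
    static AddressSpace client_as;

    static void remote_machine_isolate_bus(PCIBus *root_bus, Object *owner)
    {
        memory_region_init(&client_sysmem, owner, "client-sysmem", UINT64_MAX);
        address_space_init(&client_as, &client_sysmem, "client-as");

        /*
         * Devices on root_bus would now resolve DMA through isol_as_mem
         * instead of the global address_space_memory (see the change to
         * pci_device_iommu_address_space() in this patch).
         */
        root_bus->isol_as_mem = &client_as;
    }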
> 
> > 
> >> as such it needs the PCI buses in the same machine
> >> +     * to be part of different CPU address spaces. The following is
> >> +     * useful in that scenario.
> >> +     *
> >> +     */
> >> +    AddressSpace *isol_as_mem;
> >> +    AddressSpace *isol_as_io;
> >> +
> >>      QLIST_HEAD(, PCIBus) child; /* this will be replaced by qdev later */
> >>      QLIST_ENTRY(PCIBus) sibling;/* this will be replaced by qdev later */
> >> 
> >> diff --git a/hw/pci/pci.c b/hw/pci/pci.c
> >> index 5d30f9ca60..d5f1c6c421 100644
> >> --- a/hw/pci/pci.c
> >> +++ b/hw/pci/pci.c
> >> @@ -442,6 +442,8 @@ static void pci_root_bus_internal_init(PCIBus *bus, DeviceState *parent,
> >>      bus->slot_reserved_mask = 0x0;
> >>      bus->address_space_mem = address_space_mem;
> >>      bus->address_space_io = address_space_io;
> >> +    bus->isol_as_mem = NULL;
> >> +    bus->isol_as_io = NULL;
> >>      bus->flags |= PCI_BUS_IS_ROOT;
> >> 
> >>      /* host bridge */
> >> @@ -2676,6 +2678,16 @@ MemoryRegion *pci_address_space_io(PCIDevice *dev)
> >>      return pci_get_bus(dev)->address_space_io;
> >>  }
> >> 
> >> +AddressSpace *pci_isol_as_mem(PCIDevice *dev)
> >> +{
> >> +    return pci_get_bus(dev)->isol_as_mem;
> >> +}
> >> +
> >> +AddressSpace *pci_isol_as_io(PCIDevice *dev)
> >> +{
> >> +    return pci_get_bus(dev)->isol_as_io;
> >> +}
> >> +
> >>  static void pci_device_class_init(ObjectClass *klass, void *data)
> >>  {
> >>      DeviceClass *k = DEVICE_CLASS(klass);
> >> @@ -2699,6 +2711,7 @@ static void pci_device_class_base_init(ObjectClass *klass, void *data)
> >> 
> >>  AddressSpace *pci_device_iommu_address_space(PCIDevice *dev)
> >>  {
> >> +    AddressSpace *iommu_as = NULL;
> >>      PCIBus *bus = pci_get_bus(dev);
> >>      PCIBus *iommu_bus = bus;
> >>      uint8_t devfn = dev->devfn;
> >> @@ -2745,6 +2758,10 @@ AddressSpace *pci_device_iommu_address_space(PCIDevice *dev)
> >>      if (!pci_bus_bypass_iommu(bus) && iommu_bus && iommu_bus->iommu_fn) {
> >>          return iommu_bus->iommu_fn(bus, iommu_bus->iommu_opaque, devfn);
> >>      }
> >> +    iommu_as = pci_isol_as_mem(dev);
> >> +    if (iommu_as) {
> >> +        return iommu_as;
> >> +    }
> >>      return &address_space_memory;
> >>  }
> >> 
> >> diff --git a/hw/pci/pci_bridge.c b/hw/pci/pci_bridge.c
> >> index da34c8ebcd..98366768d2 100644
> >> --- a/hw/pci/pci_bridge.c
> >> +++ b/hw/pci/pci_bridge.c
> >> @@ -383,6 +383,11 @@ void pci_bridge_initfn(PCIDevice *dev, const char *typename)
> >>      sec_bus->address_space_io = &br->address_space_io;
> >>      memory_region_init(&br->address_space_io, OBJECT(br), "pci_bridge_io",
> >>                         4 * GiB);
> >> +
> >> +    /* This PCI bridge puts the sec_bus in its parent's address space */
> >> +    sec_bus->isol_as_mem = pci_isol_as_mem(dev);
> >> +    sec_bus->isol_as_io = pci_isol_as_io(dev);
> >> +
> >>      br->windows = pci_bridge_region_init(br);
> >>      QLIST_INIT(&sec_bus->child);
> >>      QLIST_INSERT_HEAD(&parent->child, sec_bus, sibling);
> >> -- 
> >> 2.20.1
> 
-- 
Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK
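
(One last sketch, not part of the patch, of the resolution order the pci.c
hunk introduces for a device whose bus has no IOMMU: a bus-level iommu_fn
still wins, then the bus's isolated address space, and only then the global
address_space_memory. The helper name check_dma_address_space is invented.)

    #include "qemu/osdep.h"
    #include "hw/pci/pci.h"
    #include "exec/address-spaces.h"

    /* Sanity check of the DMA address-space resolution after this patch. */
    static void check_dma_address_space(PCIDevice *dev)
    {
        AddressSpace *as = pci_device_iommu_address_space(dev);

        if (pci_isol_as_mem(dev)) {
            /* Isolated bus: the device sees its client's address space. */
            assert(as == pci_isol_as_mem(dev));
        } else {
            /* Legacy case: fall back to the global view. */
            assert(as == &address_space_memory);
        }
    }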