On Wed, Apr 13, 2022 at 04:37:35PM +0200, Igor Mammedov wrote: > On Thu, 31 Mar 2022 08:41:01 -0400 > Peter Xu <pet...@redhat.com> wrote: > > > On Thu, Mar 31, 2022 at 10:47:33AM +0100, Stefan Hajnoczi wrote: > > > On Wed, Mar 30, 2022 at 01:13:03PM -0400, Peter Xu wrote: > > > > On Wed, Mar 30, 2022 at 05:08:24PM +0100, Stefan Hajnoczi wrote: > > > > > On Wed, Mar 30, 2022 at 08:53:16AM -0400, Peter Xu wrote: > > > > > > On Wed, Mar 30, 2022 at 11:04:24AM +0100, Stefan Hajnoczi wrote: > > > > > > > This makes me wonder whether there is a deeper issue with the > > > > > > > pci_setup_iommu() API: the lack of per-device cleanup callbacks. > > > > > > > Per-device IOMMU resources should be freed when a device is hot > > > > > > > unplugged. > > > > > > > > > > > > > > From what I can tell this is not the case today: > > > > > > > > > > > > > > - hw/i386/intel_iommu.c:vtd_find_add_as() allocates and adds > > > > > > > device > > > > > > > address spaces but I can't find where they are removed and > > > > > > > freed. > > > > > > > VTDAddressSpace instances pointed to from vtd_bus->dev_as[] are > > > > > > > leaked. > > > > > > > > > > > > > > - hw/i386/amd_iommu.c has similar leaks. > > > > > > > > > > > > AFAICT it's because there's no device-specific data cached in the > > > > > > per-device IOMMU address space, at least so far. IOW, all the data > > > > > > structures allocated here can be re-used when a new device is > > > > > > plugged in > > > > > > after the old device unplugged. > > > > > > > > > > > > It's definitely not ideal since after unplug (and before a new > > > > > > device > > > > > > plugged in) the resource is not needed at all so it's kind of > > > > > > wasted, but > > > > > > it should work functionally. If to achieve that, some > > > > > > iommu_unplug() or > > > > > > iommu_cleanup() hook sounds reasonable. > > > > > > > > > > I guess the question is whether PCI busses can be hotplugged with > > > > > IOMMUs. If yes, then there is a memory leak that matters for > > > > > intel_iommu.c and amd_iommu.c. > > > > > > > > It can't, and we only support one vIOMMU so far for both (commit > > > > 1b3bf13890fd849b26). Thanks, > > > > > > I see, thanks! > > > > > > Okay, summarizing options for the vfio-user IOMMU: > > > > > > 1. Use the same singleton approach as existing IOMMUs where the > > > MemoryRegion/AddressSpace are never freed. Don't bother deleting. > > > > > > 2. Keep the approach in this patch where vfio-user code manually calls a > > > custom delete function (not part of the pci_setup_iommu() API). This > > > is slightly awkward to do without global state and that's what > > > started this discussion. > > > > > > 3. Introduce an optional pci_setup_iommu() callback: > > > > > > typdef void (*PCIIOMMUDeviceUnplug)(PCIBus *bus, void *opaque, int > > > devfn); > > > > > > Solves the awkwardness of option #2. Not needed by existing IOMMU > > > devices. > > > > Looks all workable to me. One tiny thing is if we'd like 3) we may want to > > pass over the PCIDevice* too because in this case IIUC we'd need to double > > check the device class before doing anything - we may not want to call the > > vfio-user callbacks for general emulated devices under the same pci bus. > > > > I think we could also fetch that from PCIBus.devices[devfn] but that's just > > not as obvious. > > > > Option 4) is as mentioned previously, that we add another device unplug > > hook that can be registered per-device. I just didn't think thoroughly on > can you expand on why per device hook is needed?
E.g. when the pci bus that contains the vfio-user device also contains another emulated device? Then IIUC we only want to call the vfio-user hook for the vfio-user device, not the rest ones on the same bus? Per-bus will work too, but again then the per-bus hook will need to first identify the PCIDevice* object so it'll work similarly as a per-device hook. > > > how it would interact with the current HotplugHandler design yet.. it looks > > quite similar but so far it's either for the machine type or pci bus, not > > capable of registering on one single device (and it's always a mistery to > > me why we'd rather ignore the per-bus hook if the machine hook > > existed.. that's in qdev_get_hotplug_handler). > > machine hook is there for bus-less devices mainly, if it's not defined > code will fallback to bus handler if any exists. > > However machine hook can also be used to override default hotplug chain > to do to implement non trivial plug/unplug flow. > for example see pc_get_hotplug_handler(), corresponding > pc_machine_device_[pre_plug|plug|unplug...]_cb() where > plug/unplug chain is altered for some PCI devices types. > Perhaps the same can be done for vfio. It just seems non-obvious, no? For example, if someone implementes a pci bus with hotplug_handler() being provided, it will surprise me a bit if it's triggered conditionally, depending on which machine type the bus is attached to, and whether this machine type has get_hotplug_handler(). Thanks, -- Peter Xu