On Wed, Apr 13, 2022 at 04:37:35PM +0200, Igor Mammedov wrote:
> On Thu, 31 Mar 2022 08:41:01 -0400
> Peter Xu <pet...@redhat.com> wrote:
> 
> > On Thu, Mar 31, 2022 at 10:47:33AM +0100, Stefan Hajnoczi wrote:
> > > On Wed, Mar 30, 2022 at 01:13:03PM -0400, Peter Xu wrote:  
> > > > On Wed, Mar 30, 2022 at 05:08:24PM +0100, Stefan Hajnoczi wrote:  
> > > > > On Wed, Mar 30, 2022 at 08:53:16AM -0400, Peter Xu wrote:  
> > > > > > On Wed, Mar 30, 2022 at 11:04:24AM +0100, Stefan Hajnoczi wrote:  
> > > > > > > This makes me wonder whether there is a deeper issue with the
> > > > > > > pci_setup_iommu() API: the lack of per-device cleanup callbacks.
> > > > > > > Per-device IOMMU resources should be freed when a device is hot
> > > > > > > unplugged.
> > > > > > > 
> > > > > > > From what I can tell this is not the case today:
> > > > > > > 
> > > > > > > - hw/i386/intel_iommu.c:vtd_find_add_as() allocates and adds 
> > > > > > > device
> > > > > > >   address spaces but I can't find where they are removed and 
> > > > > > > freed.
> > > > > > >   VTDAddressSpace instances pointed to from vtd_bus->dev_as[] are 
> > > > > > > leaked.
> > > > > > > 
> > > > > > > - hw/i386/amd_iommu.c has similar leaks.  
> > > > > > 
> > > > > > AFAICT it's because there's no device-specific data cached in the
> > > > > > per-device IOMMU address space, at least so far.  IOW, all the data
> > > > > > structures allocated here can be re-used when a new device is 
> > > > > > plugged in
> > > > > > after the old device unplugged.
> > > > > > 
> > > > > > It's definitely not ideal since after unplug (and before a new 
> > > > > > device
> > > > > > plugged in) the resource is not needed at all so it's kind of 
> > > > > > wasted, but
> > > > > > it should work functionally.  If to achieve that, some 
> > > > > > iommu_unplug() or
> > > > > > iommu_cleanup() hook sounds reasonable.  
> > > > > 
> > > > > I guess the question is whether PCI busses can be hotplugged with
> > > > > IOMMUs. If yes, then there is a memory leak that matters for
> > > > > intel_iommu.c and amd_iommu.c.  
> > > > 
> > > > It can't, and we only support one vIOMMU so far for both (commit
> > > > 1b3bf13890fd849b26).  Thanks,  
> > > 
> > > I see, thanks!
> > > 
> > > Okay, summarizing options for the vfio-user IOMMU:
> > > 
> > > 1. Use the same singleton approach as existing IOMMUs where the
> > >    MemoryRegion/AddressSpace are never freed. Don't bother deleting.
> > > 
> > > 2. Keep the approach in this patch where vfio-user code manually calls a
> > >    custom delete function (not part of the pci_setup_iommu() API). This
> > >    is slightly awkward to do without global state and that's what
> > >    started this discussion.
> > > 
> > > 3. Introduce an optional pci_setup_iommu() callback:
> > > 
> > >    typdef void (*PCIIOMMUDeviceUnplug)(PCIBus *bus, void *opaque, int 
> > > devfn);
> > > 
> > >    Solves the awkwardness of option #2. Not needed by existing IOMMU
> > >    devices.  
> > 
> > Looks all workable to me.  One tiny thing is if we'd like 3) we may want to
> > pass over the PCIDevice* too because in this case IIUC we'd need to double
> > check the device class before doing anything - we may not want to call the
> > vfio-user callbacks for general emulated devices under the same pci bus.
> > 
> > I think we could also fetch that from PCIBus.devices[devfn] but that's just
> > not as obvious.
> > 
> > Option 4) is as mentioned previously, that we add another device unplug
> > hook that can be registered per-device.  I just didn't think thoroughly on
> can you expand on why per device hook is needed?

E.g. when the pci bus that contains the vfio-user device also contains
another emulated device?  Then IIUC we only want to call the vfio-user hook
for the vfio-user device, not the rest ones on the same bus?

Per-bus will work too, but again then the per-bus hook will need to first
identify the PCIDevice* object so it'll work similarly as a per-device hook.

> 
> > how it would interact with the current HotplugHandler design yet.. it looks
> > quite similar but so far it's either for the machine type or pci bus, not
> > capable of registering on one single device (and it's always a mistery to
> > me why we'd rather ignore the per-bus hook if the machine hook
> > existed.. that's in qdev_get_hotplug_handler).
> 
> machine hook is there for bus-less devices mainly, if it's not defined
> code will fallback to bus handler if any exists.
> 
> However machine hook can also be used to override default hotplug chain
> to do to implement non trivial plug/unplug flow.
> for example see pc_get_hotplug_handler(), corresponding
> pc_machine_device_[pre_plug|plug|unplug...]_cb() where
> plug/unplug chain is altered for some PCI devices types.
> Perhaps the same can be done for vfio.

It just seems non-obvious, no?  For example, if someone implementes a pci
bus with hotplug_handler() being provided, it will surprise me a bit if
it's triggered conditionally, depending on which machine type the bus is
attached to, and whether this machine type has get_hotplug_handler().

Thanks,

-- 
Peter Xu


Reply via email to