On Thu, Mar 31, 2022 at 10:47:33AM +0100, Stefan Hajnoczi wrote:
> On Wed, Mar 30, 2022 at 01:13:03PM -0400, Peter Xu wrote:
> > On Wed, Mar 30, 2022 at 05:08:24PM +0100, Stefan Hajnoczi wrote:
> > > On Wed, Mar 30, 2022 at 08:53:16AM -0400, Peter Xu wrote:
> > > > On Wed, Mar 30, 2022 at 11:04:24AM +0100, Stefan Hajnoczi wrote:
> > > > > This makes me wonder whether there is a deeper issue with the
> > > > > pci_setup_iommu() API: the lack of per-device cleanup callbacks.
> > > > > Per-device IOMMU resources should be freed when a device is hot
> > > > > unplugged.
> > > > > 
> > > > > From what I can tell this is not the case today:
> > > > > 
> > > > > - hw/i386/intel_iommu.c:vtd_find_add_as() allocates and adds device
> > > > >   address spaces but I can't find where they are removed and freed.
> > > > >   VTDAddressSpace instances pointed to from vtd_bus->dev_as[] are 
> > > > > leaked.
> > > > > 
> > > > > - hw/i386/amd_iommu.c has similar leaks.
> > > > 
> > > > AFAICT it's because there's no device-specific data cached in the
> > > > per-device IOMMU address space, at least so far.  IOW, all the data
> > > > structures allocated here can be re-used when a new device is plugged in
> > > > after the old device unplugged.
> > > > 
> > > > It's definitely not ideal since after unplug (and before a new device
> > > > plugged in) the resource is not needed at all so it's kind of wasted, 
> > > > but
> > > > it should work functionally.  If to achieve that, some iommu_unplug() or
> > > > iommu_cleanup() hook sounds reasonable.
> > > 
> > > I guess the question is whether PCI busses can be hotplugged with
> > > IOMMUs. If yes, then there is a memory leak that matters for
> > > intel_iommu.c and amd_iommu.c.
> > 
> > It can't, and we only support one vIOMMU so far for both (commit
> > 1b3bf13890fd849b26).  Thanks,
> 
> I see, thanks!
> 
> Okay, summarizing options for the vfio-user IOMMU:
> 
> 1. Use the same singleton approach as existing IOMMUs where the
>    MemoryRegion/AddressSpace are never freed. Don't bother deleting.
> 
> 2. Keep the approach in this patch where vfio-user code manually calls a
>    custom delete function (not part of the pci_setup_iommu() API). This
>    is slightly awkward to do without global state and that's what
>    started this discussion.
> 
> 3. Introduce an optional pci_setup_iommu() callback:
> 
>    typdef void (*PCIIOMMUDeviceUnplug)(PCIBus *bus, void *opaque, int devfn);
> 
>    Solves the awkwardness of option #2. Not needed by existing IOMMU
>    devices.

Looks all workable to me.  One tiny thing is if we'd like 3) we may want to
pass over the PCIDevice* too because in this case IIUC we'd need to double
check the device class before doing anything - we may not want to call the
vfio-user callbacks for general emulated devices under the same pci bus.

I think we could also fetch that from PCIBus.devices[devfn] but that's just
not as obvious.

Option 4) is as mentioned previously, that we add another device unplug
hook that can be registered per-device.  I just didn't think thoroughly on
how it would interact with the current HotplugHandler design yet.. it looks
quite similar but so far it's either for the machine type or pci bus, not
capable of registering on one single device (and it's always a mistery to
me why we'd rather ignore the per-bus hook if the machine hook
existed.. that's in qdev_get_hotplug_handler).

Copying Igor too.

-- 
Peter Xu


Reply via email to