Jason Gunthorpe <j...@nvidia.com> writes:

> On Tue, Jun 03, 2025 at 02:20:51PM +0800, Xu Yilun wrote:
>> > Wouldn’t it be simpler to skip the reference count increment altogether
>> > and just call tsm_unbind in the virtual device’s destroy callback?
>> > (iommufd_vdevice_destroy())
>> 
>> The vdevice refcount is the main concern, there is also an IOMMU_DESTROY
>> ioctl. User could just free the vdevice instance if no refcount, while VFIO
>> is still in bound state. That seems not the correct free order.
>
> Freeing the vdevice should automatically unbind it..
>

One challenge I ran into during implementation was the dependency of
vfio on iommufd_device. When vfio needs to perform a tsm_unbind,
it only has access to an iommufd_device.

However, TSM operations like binding and unbinding are handled at the
iommufd_vdevice level. The issue? There’s no direct link from
iommufd_device back to iommufd_vdevice.

To address this, I modified the following structures:

modified   drivers/iommu/iommufd/iommufd_private.h
@@ -428,6 +428,7 @@ struct iommufd_device {
        /* protect iopf_enabled counter */
        struct mutex iopf_lock;
        unsigned int iopf_enabled;
+       struct iommufd_vdevice *vdev;
 };
 
 static inline struct iommufd_device *
@@ -613,6 +614,7 @@ struct iommufd_vdevice {
        struct iommufd_object obj;
        struct iommufd_ctx *ictx;
        struct iommufd_viommu *viommu;
+       struct iommufd_device *idev;
        struct device *dev;
        struct mutex    mutex;  /* mutex to synchronize updates to tsm_bound */
        u64 id; /* per-vIOMMU virtual ID */

These fields are updated during tsm_bind and tsm_unbind, so they must be
protected by the appropriate locks:

Updating vdevice->idev requires holding vdev->mutex (vdev_lock).
Updating device->vdev requires idev->igroup->lock (idev_lock).

tsm_unbind in vdevice_destroy:

vdevice_destroy() ends up calling tsm_unbind() while holding only the
vdev_lock. At first glance, this seems unsafe. But in practice, it's
fine because the corresponding iommufd_device has already been destroyed
when the VFIO device file descriptor was closed—triggering
vfio_df_iommufd_unbind().

I’ve added an in-code comment to explain why tsm_unbind() is safe here
without acquiring the idev_lock. Hope that is ok.

-aneesh

Reply via email to