>-----Original Message----- >From: Cédric Le Goater <c...@redhat.com> >Subject: Re: [PATCH rfcv1 4/6] vfio: initialize IOMMUFDDevice and pass to >vIOMMU > >On 1/23/24 10:46, Duan, Zhenzhong wrote: >> >> >>> -----Original Message----- >>> From: Cédric Le Goater <c...@redhat.com> >>> Subject: Re: [PATCH rfcv1 4/6] vfio: initialize IOMMUFDDevice and pass >to >>> vIOMMU >>> >>> On 1/15/24 11:13, Zhenzhong Duan wrote: >>>> Initialize IOMMUFDDevice in vfio and pass to vIOMMU, so that vIOMMU >>>> could get hw IOMMU information. >>>> >>>> In VFIO legacy backend mode, we still pass a NULL IOMMUFDDevice to >>> vIOMMU, >>>> in case vIOMMU needs some processing for VFIO legacy backend mode. >>>> >>>> Originally-by: Yi Liu <yi.l....@intel.com> >>>> Signed-off-by: Nicolin Chen <nicol...@nvidia.com> >>>> Signed-off-by: Yi Sun <yi.y....@linux.intel.com> >>>> Signed-off-by: Zhenzhong Duan <zhenzhong.d...@intel.com> >>>> --- >>>> include/hw/vfio/vfio-common.h | 2 ++ >>>> hw/vfio/iommufd.c | 2 ++ >>>> hw/vfio/pci.c | 24 +++++++++++++++++++----- >>>> 3 files changed, 23 insertions(+), 5 deletions(-) >>>> >>>> diff --git a/include/hw/vfio/vfio-common.h b/include/hw/vfio/vfio- >>> common.h >>>> index 9b7ef7d02b..fde0d0ca60 100644 >>>> --- a/include/hw/vfio/vfio-common.h >>>> +++ b/include/hw/vfio/vfio-common.h >>>> @@ -31,6 +31,7 @@ >>>> #endif >>>> #include "sysemu/sysemu.h" >>>> #include "hw/vfio/vfio-container-base.h" >>>> +#include "sysemu/iommufd_device.h" >>>> >>>> #define VFIO_MSG_PREFIX "vfio %s: " >>>> >>>> @@ -126,6 +127,7 @@ typedef struct VFIODevice { >>>> bool dirty_tracking; >>>> int devid; >>>> IOMMUFDBackend *iommufd; >>>> + IOMMUFDDevice idev; >>>> } VFIODevice; >>>> >>>> struct VFIODeviceOps { >>>> diff --git a/hw/vfio/iommufd.c b/hw/vfio/iommufd.c >>>> index 9bfddc1360..cbd035f148 100644 >>>> --- a/hw/vfio/iommufd.c >>>> +++ b/hw/vfio/iommufd.c >>>> @@ -309,6 +309,7 @@ static int iommufd_cdev_attach(const char >*name, >>> VFIODevice *vbasedev, >>>> VFIOContainerBase *bcontainer; >>>> VFIOIOMMUFDContainer *container; >>>> VFIOAddressSpace *space; >>>> + IOMMUFDDevice *idev = &vbasedev->idev; >>>> struct vfio_device_info dev_info = { .argsz = sizeof(dev_info) }; >>>> int ret, devfd; >>>> uint32_t ioas_id; >>>> @@ -428,6 +429,7 @@ found_container: >>>> QLIST_INSERT_HEAD(&bcontainer->device_list, vbasedev, >>> container_next); >>>> QLIST_INSERT_HEAD(&vfio_device_list, vbasedev, global_next); >>>> >>>> + iommufd_device_init(idev, sizeof(*idev), container->be, vbasedev- >>>> devid); >>>> trace_iommufd_cdev_device_info(vbasedev->name, devfd, >vbasedev- >>>> num_irqs, >>>> vbasedev->num_regions, >>>> vbasedev->flags); >>>> return 0; >>>> diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c >>>> index d7fe06715c..2c3a5d267b 100644 >>>> --- a/hw/vfio/pci.c >>>> +++ b/hw/vfio/pci.c >>>> @@ -3107,11 +3107,21 @@ static void vfio_realize(PCIDevice *pdev, >>> Error **errp) >>>> >>>> vfio_bars_register(vdev); >>>> >>>> - ret = vfio_add_capabilities(vdev, errp); >>>> + if (vbasedev->iommufd) { >>>> + ret = pci_device_set_iommu_device(pdev, &vbasedev->idev, errp); >>>> + } else { >>>> + ret = pci_device_set_iommu_device(pdev, 0, errp); >>> >>> >>> AFAICT, pci_device_set_iommu_device() with a NULL IOMMUFDDevice >will >>> do >>> nothing. Why call it ? >> >> We will do something in nesting series, see >https://github.com/yiliu1765/qemu/commit/7f0bb59575bb5cf38618ae950 >f68a8661676e881 > >ok, that's not much. idev is used as a capability bool and later on >to pass the /dev/iommu fd. We don't need to support the legacy mode ?
It's better to have for legacy mode. Especially when we support address width 57 to QEMU Intel_iommu in future. > >> Another choice is to call pci_device_set_iommu_device() no matter which >backend >> is used and check idev->iommufd in vtd_dev_set_iommu_device(). Is this >better >> for you? > >yes. Should be fine. There is more to it though. > >IIUC, what will determine most of the requirements, is the legacy >mode. We also need the host iommu info in that case. As said Eric, >ideally, we should introduce a common abstract "host-iommu-info" struct >and sub structs associated with the iommu backends (iommufd + legacy) >which would be allocated accordingly. I see, I'll make a rfcv2 as you and Eric suggested and discuss further with Eric what elements he needs in legacy sub structs. > >So, IOMMUFDDevice usage should be limited to the iommufd files. All PCI >files should use the common abstract type. We should define these data >structures first. They could be simple C struct for now. We will see if >QOM applies after. Got it. Thanks Zhenzhong