Hi Alex, Thanks very much for reviewing the patches.
> On July 28, 2023 1:25 AM, Alex Williamson <alex.william...@redhat.com> wrote: > > On Thu, 27 Jul 2023 03:24:08 -0400 > Jing Liu <jing2....@intel.com> wrote: > > > From: Reinette Chatre <reinette.cha...@intel.com> > > > > Kernel provides the guidance of dynamic MSI-X allocation support of > > passthrough device, by clearing the VFIO_IRQ_INFO_NORESIZE flag to > > guide user space. > > > > Fetch and store the flags from host for later use to determine if > > specific flags are set. > > > > Signed-off-by: Reinette Chatre <reinette.cha...@intel.com> > > Signed-off-by: Jing Liu <jing2....@intel.com> > > --- > > hw/vfio/pci.c | 12 ++++++++++++ > > hw/vfio/pci.h | 1 + > > hw/vfio/trace-events | 2 ++ > > 3 files changed, 15 insertions(+) > > > > diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c index > > a205c6b1130f..0c4ac0873d40 100644 > > --- a/hw/vfio/pci.c > > +++ b/hw/vfio/pci.c > > @@ -1572,6 +1572,7 @@ static void vfio_msix_early_setup(VFIOPCIDevice > > *vdev, Error **errp) > > > > static int vfio_msix_setup(VFIOPCIDevice *vdev, int pos, Error > > **errp) { > > + struct vfio_irq_info irq_info = { .argsz = sizeof(irq_info) }; > > int ret; > > Error *err = NULL; > > > > @@ -1624,6 +1625,17 @@ static int vfio_msix_setup(VFIOPCIDevice *vdev, int > pos, Error **errp) > > memory_region_set_enabled(&vdev->pdev.msix_table_mmio, false); > > } > > > > + irq_info.index = VFIO_PCI_MSIX_IRQ_INDEX; > > + ret = ioctl(vdev->vbasedev.fd, VFIO_DEVICE_GET_IRQ_INFO, &irq_info); > > + if (ret) { > > + /* This can fail for an old kernel or legacy PCI dev */ > > + trace_vfio_msix_setup_get_irq_info_failure(strerror(errno)); > > We only call vfio_msix_setup() if the device has an MSI-X capability, so the > "legacy PCI" portion of this comment seems unjustified. > Otherwise the GET_IRQ_INFO ioctl has always existed, so I'd also question the > "old kernel" part of this comment. Oh, yes, I just realize that only VFIO_PCI_ERR_IRQ_INDEX and VFIO_PCI_REQ_IRQ_INDEX were added later in include/uapi/linux/vfio.h. Thus, this ioctl() with MSIX index would not fail by the old-kernel or legacy-PCI reason. Thanks for pointing out this to me. We don't currently sanity test the device > exposed MSI-X info versus that reported by GET_IRQ_INFO, but it seems valid to > do so. Do we want to keep the check of possible failure from kernel (e.g., -EFAULT) and report the error code back to caller? Maybe like this, static void vfio_msix_early_setup(VFIOPCIDevice *vdev, Error **errp) { .... msix->entries = (ctrl & PCI_MSIX_FLAGS_QSIZE) + 1; ret = ioctl(vdev->vbasedev.fd, VFIO_DEVICE_GET_IRQ_INFO, &irq_info); if (ret < 0) { error_setg_errno(errp, -ret, "failed to get MSI-X IRQ INFO"); return; } else { vdev->msix->noresize = !!(irq_info.flags & VFIO_IRQ_INFO_NORESIZE); } ... trace_vfio_msix_early_setup(vdev->vbasedev.name, pos, msix->table_bar, msix->table_offset, msix->entries, vdev->msix->noresize); .... } > I'd expect this to happen in vfio_msix_early_setup() though, especially > since that's where the remainder of VFIOMSIXInfo is setup. > > > + } else { > > + vdev->msix->irq_info_flags = irq_info.flags; > > + } > > + trace_vfio_msix_setup_irq_info_flags(vdev->vbasedev.name, > > + vdev->msix->irq_info_flags); > > + > > return 0; > > } > > > > diff --git a/hw/vfio/pci.h b/hw/vfio/pci.h index > > a2771b9ff3cc..ad34ec56d0ae 100644 > > --- a/hw/vfio/pci.h > > +++ b/hw/vfio/pci.h > > @@ -113,6 +113,7 @@ typedef struct VFIOMSIXInfo { > > uint32_t table_offset; > > uint32_t pba_offset; > > unsigned long *pending; > > + uint32_t irq_info_flags; > > Why not simply pull out a "noresize" bool? Thanks, > Will change to a bool type. Thanks, Jing > Alex > > > } VFIOMSIXInfo; > > > > #define TYPE_VFIO_PCI "vfio-pci" > > diff --git a/hw/vfio/trace-events b/hw/vfio/trace-events index > > ee7509e68e4f..7d4a398f044d 100644 > > --- a/hw/vfio/trace-events > > +++ b/hw/vfio/trace-events > > @@ -28,6 +28,8 @@ vfio_pci_read_config(const char *name, int addr, int > > len, int val) " (%s, @0x%x, vfio_pci_write_config(const char *name, int > > addr, int > val, int len) " (%s, @0x%x, 0x%x, len=0x%x)" > > vfio_msi_setup(const char *name, int pos) "%s PCI MSI CAP @0x%x" > > vfio_msix_early_setup(const char *name, int pos, int table_bar, int > > offset, int > entries) "%s PCI MSI-X CAP @0x%x, BAR %d, offset 0x%x, entries %d" > > +vfio_msix_setup_get_irq_info_failure(const char *errstr) > "VFIO_DEVICE_GET_IRQ_INFO failure: %s" > > +vfio_msix_setup_irq_info_flags(const char *name, uint32_t flags) " (%s) > > MSI-X > irq info flags 0x%x" > > vfio_check_pcie_flr(const char *name) "%s Supports FLR via PCIe cap" > > vfio_check_pm_reset(const char *name) "%s Supports PM reset" > > vfio_check_af_flr(const char *name) "%s Supports FLR via AF cap"