On Wed, 1 Jun 2022 17:00:54 +0000
Jag Raman <jag.ra...@oracle.com> wrote:
>
> Hi Alex,
>
> Just to add some more detail, the emulated PCI device in QEMU presently
> maintains a MSIx table (PCIDevice->msix_table) and Pending Bit Array. In the
> present VFIO PCI device implementation, QEMU leverages the same
> MSIx table for interrupt masking/unmasking. The backend PCI device (such as
> the passthru device) always thinks that the interrupt is unmasked and lets
> QEMU manage masking.
>
> Whereas in the vfio-user case, the client additionally pushes a copy of
> emulated PCI device’s table downstream to the remote device. We did this
> to allow a small set of devices (such as e1000e) to clear the
> PBA (msix_clr_pending()). Secondly, the remote device uses its copy of the
> MSIx table to determine if interrupt should be triggered - this would prevent
> an interrupt from being sent to the client unnecessarily if it's masked.
>
> We are wondering if pushing the MSIx table to the remote device and
> reading PBA from it would diverge from the VFIO protocol specification?
>
> From your comment, I understand it’s similar to VFIO protocol because VFIO
> clients could mask an interrupt using VFIO_DEVICE_SET_IRQS ioctl +
> VFIO_IRQ_SET_ACTION_MASK / _UNMASK flags. I observed that QEMU presently
> does not use this approach and the kernel does not support it for MSI.

I believe the SET_IRQS ioctl definition is pre-enabled to support
masking and unmasking; we've just lacked kernel support to mask at the
device, which leads to the hybrid approach we have today. Our intention
would be to use the current uAPI to provide that masking support, at
which point we'd leave the PBA mapped to the device.

So whether your proposal diverges from the VFIO uAPI depends on what
you mean by "pushing the MSIx table to the remote device". If that's
done by implementing the existing SET_IRQS masking support, then you're
spot on. OTOH, if you're actually pushing a copy of the MSIx table from
the client, that's certainly not how I had envisioned the kernel
interface.

Thanks,
Alex
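
For reference, a minimal sketch of what per-vector masking through the
existing SET_IRQS uAPI could look like from a client. The helper names
build_msix_mask_request() and vfio_msix_set_mask() are hypothetical and
only for illustration; the struct, flags and ioctl come from
<linux/vfio.h>. As noted above, a kernel without masking support for
MSI/MSI-X is expected to reject this request, so treat it as an
illustration of the interface rather than something that works today.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <sys/ioctl.h>
#include <linux/vfio.h>

/*
 * Hypothetical helper: fill a vfio_irq_set that masks (mask != 0) or
 * unmasks (mask == 0) `count` MSI-X vectors starting at `start`.
 * No payload follows the header, hence VFIO_IRQ_SET_DATA_NONE.
 */
static void build_msix_mask_request(struct vfio_irq_set *set,
                                    uint32_t start, uint32_t count,
                                    int mask)
{
    assert(set != NULL);
    memset(set, 0, sizeof(*set));
    set->argsz = sizeof(*set);
    set->flags = VFIO_IRQ_SET_DATA_NONE |
                 (mask ? VFIO_IRQ_SET_ACTION_MASK
                       : VFIO_IRQ_SET_ACTION_UNMASK);
    set->index = VFIO_PCI_MSIX_IRQ_INDEX;
    set->start = start;
    set->count = count;
}

/*
 * Hypothetical helper: mask or unmask a single MSI-X vector on an
 * already-open VFIO device fd. Returns the ioctl result (0 on success,
 * -1 with errno set on failure, e.g. when the kernel lacks support).
 */
static int vfio_msix_set_mask(int device_fd, uint32_t vector, int mask)
{
    struct vfio_irq_set set;

    build_msix_mask_request(&set, vector, 1, mask);
    return ioctl(device_fd, VFIO_DEVICE_SET_IRQS, &set);
}
```

With this shape the client never ships a copy of the MSI-X table; it
only forwards mask/unmask events for individual vectors, and the PBA
can stay with the device.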