When restoring after migration, vfio_enable_vectors() is called to
re-enable MSI-X interrupts for assigned devices. It passes the range from
0 to nr_vectors to the kernel to enable MSI-X along with the vectors
unmasked in the guest. While MSI-X is being enabled, every vector in
that range is allocated by the ioctl().

When dynamic MSI-X allocation is supported, we only want the vectors
that the guest has unmasked to be allocated and enabled. Therefore, QEMU
can first set vector 0 to enable MSI-X; after that, the remaining vectors
can be allocated as needed.

Signed-off-by: Jing Liu <jing2....@intel.com>
---
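Note (illustration only, not part of the change): the first step described
above can be sketched in isolation as follows. This is a minimal sketch that
assumes the Linux <linux/vfio.h> uapi header is available. The helper names
sketch_needs_vector0_first() and sketch_build_vector0_request() are
hypothetical and do not exist in QEMU; plain calloc()/free() stand in for
g_malloc0()/g_free().

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>
#include <linux/vfio.h>

/*
 * Hypothetical predicate: take the "vector 0 first" path only for MSI-X,
 * and only when the IRQ info flags do not carry VFIO_IRQ_INFO_NORESIZE,
 * which is the condition the patch uses to detect dynamic allocation.
 */
static bool sketch_needs_vector0_first(bool msix, uint32_t irq_info_flags)
{
    return msix && !(irq_info_flags & VFIO_IRQ_INFO_NORESIZE);
}

/*
 * Hypothetical helper: build a VFIO_DEVICE_SET_IRQS request that covers
 * only MSI-X vector 0 and carries an invalid eventfd (-1).
 * Returns NULL on allocation failure; the caller releases it with free().
 */
static struct vfio_irq_set *sketch_build_vector0_request(void)
{
    size_t argsz = sizeof(struct vfio_irq_set) + sizeof(int32_t);
    struct vfio_irq_set *irq_set = calloc(1, argsz);
    int32_t fd = -1;

    if (!irq_set) {
        return NULL;
    }

    irq_set->argsz = (uint32_t)argsz;
    assert(irq_set->argsz == argsz); /* size must fit the 32-bit field */
    irq_set->flags = VFIO_IRQ_SET_DATA_EVENTFD |
                     VFIO_IRQ_SET_ACTION_TRIGGER;
    irq_set->index = VFIO_PCI_MSIX_IRQ_INDEX;
    irq_set->start = 0;
    irq_set->count = 1;
    /* The payload is a single int32_t eventfd slot; -1 means "no eventfd". */
    memcpy(irq_set->data, &fd, sizeof(fd));

    return irq_set;
}
```

A caller would pass the returned buffer to ioctl(fd, VFIO_DEVICE_SET_IRQS, req)
on the device fd, as the hunk below does, and free it afterwards. The
full-range request that follows is unchanged by this patch.
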
 hw/vfio/pci.c | 32 ++++++++++++++++++++++++++++++++
 1 file changed, 32 insertions(+)

diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c
index 8c485636445c..43ffacd5b36a 100644
--- a/hw/vfio/pci.c
+++ b/hw/vfio/pci.c
@@ -375,6 +375,38 @@ static int vfio_enable_vectors(VFIOPCIDevice *vdev, bool msix)
     int ret = 0, i, argsz;
     int32_t *fds;
 
+    /*
+     * If dynamic MSI-X allocation is supported, the vectors to be allocated
+     * and enabled can be scattered. Before the kernel has enabled MSI-X,
+     * setting nr_vectors makes the host allocate all of these vectors.
+     *
+     * To keep allocation on demand, first set up vector 0 with an invalid
+     * fd to get MSI-X enabled, then enable vectors by setting all of them so
+     * that the kernel allocates and enables only those enabled in the guest.
+     */
+    if (msix && !(vdev->msix->irq_info_flags & VFIO_IRQ_INFO_NORESIZE)) {
+        argsz = sizeof(*irq_set) + sizeof(*fds);
+
+        irq_set = g_malloc0(argsz);
+        irq_set->argsz = argsz;
+        irq_set->flags = VFIO_IRQ_SET_DATA_EVENTFD |
+                         VFIO_IRQ_SET_ACTION_TRIGGER;
+        irq_set->index = msix ? VFIO_PCI_MSIX_IRQ_INDEX :
+                         VFIO_PCI_MSI_IRQ_INDEX;
+        irq_set->start = 0;
+        irq_set->count = 1;
+        fds = (int32_t *)&irq_set->data;
+        fds[0] = -1;
+
+        ret = ioctl(vdev->vbasedev.fd, VFIO_DEVICE_SET_IRQS, irq_set);
+
+        g_free(irq_set);
+
+        if (ret) {
+            return ret;
+        }
+    }
+
     argsz = sizeof(*irq_set) + (vdev->nr_vectors * sizeof(*fds));
 
     irq_set = g_malloc0(argsz);
-- 
2.27.0

