Hi Stefan, Alex. On 9/10/20 12:44 PM, Stefan Hajnoczi wrote: > On Wed, Sep 09, 2020 at 04:23:53PM +0200, Philippe Mathieu-Daudé wrote: >> +/** >> + * Initialize device MSIX IRQs and register event notifiers. >> + * @irq_count: pointer to number of MSIX IRQs to initialize >> + * @notifier: Array of @irq_count notifiers (each corresponding to a MSIX >> IRQ) >> + >> + * If the number of IRQs requested exceeds the available on the device, >> + * store the number of available IRQs in @irq_count and return -EOVERFLOW. >> + */ >> +int qemu_vfio_pci_init_msix_irqs(QEMUVFIOState *s, EventNotifier *notifier, >> + unsigned *irq_count, Error **errp) >> +{ >> + int r; >> + size_t irq_set_size; >> + struct vfio_irq_set *irq_set; >> + struct vfio_irq_info irq_info = { >> + .argsz = sizeof(irq_info), >> + .index = VFIO_PCI_MSIX_IRQ_INDEX >> + }; >> + >> + if (ioctl(s->device, VFIO_DEVICE_GET_IRQ_INFO, &irq_info)) { >> + error_setg_errno(errp, errno, "Failed to get device interrupt >> info"); >> + return -errno; >> + } >> + if (irq_info.count < *irq_count) { >> + error_setg(errp, "Not enough device interrupts available"); >> + *irq_count = irq_info.count; >> + return -EOVERFLOW; >> + } >> + if (!(irq_info.flags & VFIO_IRQ_INFO_EVENTFD)) { >> + error_setg(errp, "Device interrupt doesn't support eventfd"); >> + return -EINVAL; >> + } >> + >> + irq_set_size = sizeof(*irq_set) + *irq_count * sizeof(int32_t); >> + irq_set = g_malloc0(irq_set_size); >> + >> + /* Get to a known IRQ state */ >> + *irq_set = (struct vfio_irq_set) { >> + .argsz = irq_set_size, >> + .flags = VFIO_IRQ_SET_DATA_EVENTFD | VFIO_IRQ_SET_ACTION_TRIGGER, >> + .index = irq_info.index, >> + .start = 0, >> + .count = *irq_count, >> + }; >> + >> + for (unsigned i = 0; i < *irq_count; i++) { >> + ((int32_t *)&irq_set->data)[i] = >> event_notifier_get_fd(¬ifier[i]); >> + } >> + r = ioctl(s->device, VFIO_DEVICE_SET_IRQS, irq_set); >> + g_free(irq_set); >> + if (r <= 0) { >> + error_setg_errno(errp, errno, "Failed to setup device interrupts"); >> + return -errno; >> + } else if (r < *irq_count) { >> + error_setg(errp, "Not enough device interrupts available"); >> + *irq_count = r; >> + return -EOVERFLOW; >> + } > > EOVERFLOW can occur in two cases: VFIO_DEVICE_GET_IRQ_INFO and > VFIO_DEVICE_SET_IRQS.
Yes. > > If it happens in the second case the notifier[] array has been > registered successfully. No, I don't think so: vfio_pci_set_msi_trigger() register the notifier only if vfio_msi_enable() succeeded (returned 0). If vfio_msi_enable() failed it returns the number of vectors available but do not register the notifiers. Alex, do you confirm? > > The caller has no way of distinguishing the two cases. Therefore the > caller doesn't know if the eventfds will be used by the kernel after > EOVERFLOW. > > If the second case can ever happen then this function should probably > call VFIO_DEVICE_SET_IRQS again with VFIO_IRQ_SET_DATA_NONE to > unregister the eventfds before returning EOVERFLOW. > > STefan >