On 01/03/2017 12:50, Halil Pasic wrote: > The commits 03de2f527 "virtio-blk: do not use vring in dataplane" and > 9ffe337c08 "virtio-blk: always use dataplane path if ioeventfd is active" > changed how notifications are done for virtio-blk substantially. Due to a > race condition interrupts are lost when irqfd is torn down after > notify_guest_bh was scheduled but before it actually runs.
I don't think the non-irqfd notification mechanism is thread safe, and that would be a problem for this patch. What is the path that causes the irqfd to be torn down? Only something like a reset should cause it (the only call in virtio-blk is from virtio_blk_data_plane_stop), and then the guest doesn't care anymore about interrupts. That path also does a qemu_bh_delete, so the notify_guest_bh should not be invoked at all. Paolo > Furthermore > virtio_notify_irqfd ignores the value returned by event_notifier_set > which correctly indicates that notification has failed due to bad file > descriptor. > > Let's fix this by making virtio_notify_irqfd fall back to the non-irqfd > notification mechanism if event_notifier_set fails. > > Signed-off-by: Halil Pasic <pa...@linux.vnet.ibm.com> > --- > > This is probably not the only way to fix this: suggestions welcome. I > did not use a fixes tag because I'm not sure yet where exactly things got > broken. Maybe guys more familiar with dataplane an coroutines can help > (Paolo, Stefan). > --- > hw/virtio/virtio.c | 4 +++- > 1 file changed, 3 insertions(+), 1 deletion(-) > > diff --git a/hw/virtio/virtio.c b/hw/virtio/virtio.c > index 23483c7..8e1c1e9 100644 > --- a/hw/virtio/virtio.c > +++ b/hw/virtio/virtio.c > @@ -1581,7 +1581,9 @@ void virtio_notify_irqfd(VirtIODevice *vdev, VirtQueue > *vq) > * to an atomic operation. > */ > virtio_set_isr(vq->vdev, 0x1); > - event_notifier_set(&vq->guest_notifier); > + if (event_notifier_set(&vq->guest_notifier)) { > + virtio_notify_vector(vdev, vq->vector); > + } > } > > static void virtio_irq(VirtQueue *vq) >