When we check if a driver needs a signal, we compare:

- used_event: written by the driver each time it consumes an item
- new: current idx written to the used ring, updated by us
- old: last idx we signaled about

We call vring_need_event() which does:

    return (__u16)(new_idx - event_idx - 1) < (__u16)(new_idx - old);

Previously we updated signalled_used on every check, so old was always
new - 1. Because used_event cannot bigger than new_idx, this check
becomes (ignoring wrapping):

    return new_idx == event_idx + 1;

Since the driver consumes items at the same time the device produces
items, it is very likely (and seen in logs) that the driver used_event
is too far behind new_idx and we don't signal the driver.

With libblkio virtio-blk-vhost-user driver, if the driver does not get a
signal, the libblkio client can hang polling the completion fd. This
is very easy to reproduce on some machines and impossible to reproduce
on others.

Fixed by updating signalled_used only when we signal the driver.
Tested using blkio-bench and libblkio client application that used to
hang randomly without this change.

Buglink: https://gitlab.com/libblkio/libblkio/-/issues/68
Signed-off-by: Nir Soffer <nsof...@redhat.com>
---
 subprojects/libvhost-user/libvhost-user.c | 23 +++++++++++++++++------
 1 file changed, 17 insertions(+), 6 deletions(-)

diff --git a/subprojects/libvhost-user/libvhost-user.c 
b/subprojects/libvhost-user/libvhost-user.c
index 8fb61e2df2..5f26d2d378 100644
--- a/subprojects/libvhost-user/libvhost-user.c
+++ b/subprojects/libvhost-user/libvhost-user.c
@@ -2382,12 +2382,11 @@ vu_queue_empty(VuDev *dev, VuVirtq *vq)
 }
 
 static bool
 vring_notify(VuDev *dev, VuVirtq *vq)
 {
-    uint16_t old, new;
-    bool v;
+    uint16_t old, new, used;
 
     /* We need to expose used array entries before checking used event. */
     smp_mb();
 
     /* Always notify when queue is empty (when feature acknowledge) */
@@ -2398,15 +2397,27 @@ vring_notify(VuDev *dev, VuVirtq *vq)
 
     if (!vu_has_feature(dev, VIRTIO_RING_F_EVENT_IDX)) {
         return !(vring_avail_flags(vq) & VRING_AVAIL_F_NO_INTERRUPT);
     }
 
-    v = vq->signalled_used_valid;
-    vq->signalled_used_valid = true;
+    if (!vq->signalled_used_valid) {
+        vq->signalled_used_valid = true;
+        vq->signalled_used = vq->used_idx;
+        return true;
+    }
+
+    used = vring_get_used_event(vq);
+    new = vq->used_idx;
     old = vq->signalled_used;
-    new = vq->signalled_used = vq->used_idx;
-    return !v || vring_need_event(vring_get_used_event(vq), new, old);
+
+    if (vring_need_event(used, new, old)) {
+        vq->signalled_used_valid = true;
+        vq->signalled_used = vq->used_idx;
+        return true;
+    }
+
+    return false;
 }
 
 static void _vu_queue_notify(VuDev *dev, VuVirtq *vq, bool sync)
 {
     if (unlikely(dev->broken) ||
-- 
2.40.1


Reply via email to