When we check if a driver needs a signal, we compare: - used_event: written by the driver each time it consumes an item - new: current idx written to the used ring, updated by us - old: last idx we signaled about
We call vring_need_event() which does: return (__u16)(new_idx - event_idx - 1) < (__u16)(new_idx - old); Previously we updated signalled_used on every check, so old was always new - 1. Because used_event cannot bigger than new_idx, this check becomes (ignoring wrapping): return new_idx == event_idx + 1; Since the driver consumes items at the same time the device produces items, it is very likely (and seen in logs) that the driver used_event is too far behind new_idx and we don't signal the driver. With libblkio virtio-blk-vhost-user driver, if the driver does not get a signal, the libblkio client can hang polling the completion fd. This is very easy to reproduce on some machines and impossible to reproduce on others. Fixed by updating signalled_used only when we signal the driver. Tested using blkio-bench and libblkio client application that used to hang randomly without this change. Buglink: https://gitlab.com/libblkio/libblkio/-/issues/68 Signed-off-by: Nir Soffer <nsof...@redhat.com> --- subprojects/libvhost-user/libvhost-user.c | 23 +++++++++++++++++------ 1 file changed, 17 insertions(+), 6 deletions(-) diff --git a/subprojects/libvhost-user/libvhost-user.c b/subprojects/libvhost-user/libvhost-user.c index 8fb61e2df2..5f26d2d378 100644 --- a/subprojects/libvhost-user/libvhost-user.c +++ b/subprojects/libvhost-user/libvhost-user.c @@ -2382,12 +2382,11 @@ vu_queue_empty(VuDev *dev, VuVirtq *vq) } static bool vring_notify(VuDev *dev, VuVirtq *vq) { - uint16_t old, new; - bool v; + uint16_t old, new, used; /* We need to expose used array entries before checking used event. */ smp_mb(); /* Always notify when queue is empty (when feature acknowledge) */ @@ -2398,15 +2397,27 @@ vring_notify(VuDev *dev, VuVirtq *vq) if (!vu_has_feature(dev, VIRTIO_RING_F_EVENT_IDX)) { return !(vring_avail_flags(vq) & VRING_AVAIL_F_NO_INTERRUPT); } - v = vq->signalled_used_valid; - vq->signalled_used_valid = true; + if (!vq->signalled_used_valid) { + vq->signalled_used_valid = true; + vq->signalled_used = vq->used_idx; + return true; + } + + used = vring_get_used_event(vq); + new = vq->used_idx; old = vq->signalled_used; - new = vq->signalled_used = vq->used_idx; - return !v || vring_need_event(vring_get_used_event(vq), new, old); + + if (vring_need_event(used, new, old)) { + vq->signalled_used_valid = true; + vq->signalled_used = vq->used_idx; + return true; + } + + return false; } static void _vu_queue_notify(VuDev *dev, VuVirtq *vq, bool sync) { if (unlikely(dev->broken) || -- 2.40.1