On Mon, Mar 02, 2026 at 03:51:49AM -0500, Michael S. Tsirkin wrote:
vhost_get_avail_idx is supposed to report whether it has updated
vq->avail_idx. Instead, it returns whether all entries have been
consumed, which is usually the same. But not always - in
drivers/vhost/net.c and when mergeable buffers have been enabled, the
driver checks whether the combined entries are big enough to store an
incoming packet. If not, the driver re-enables notifications with
available entries still in the ring. The incorrect return value from
vhost_get_avail_idx propagates through vhost_enable_notify and causes
the host to livelock if the guest is not making progress, as vhost will
immediately disable notifications and retry using the available entries.
Here I'd add something like this to make the full picture clear,
because it took me quite some time to understand how this relates to
the Fixes tag (which I agree is the right one to use):
This goes back to commit d3bb267bbdcb ("vhost: cache avail index in
vhost_enable_notify()") which changed vhost_enable_notify() to compare
the freshly read avail index against vq->last_avail_idx instead of the
previously cached vq->avail_idx. Commit 7ad472397667 ("vhost: move
smp_rmb() into vhost_get_avail_idx()") then carried over the same
comparison when refactoring vhost_enable_notify() to call the unified
vhost_get_avail_idx().
The obvious fix is to make vhost_get_avail_idx do what the comment
says it does and report whether new entries have been added.
Reported-by: ShuangYu <[email protected]>
Fixes: d3bb267bbdcb ("vhost: cache avail index in vhost_enable_notify()")
Cc: Stefano Garzarella <[email protected]>
Cc: Stefan Hajnoczi <[email protected]>
Signed-off-by: Michael S. Tsirkin <[email protected]>
---
Lightly tested, posting early to simplify testing for the reporter.
Tested with vhost-vsock and I didn't see any issue.
Thanks!
Reviewed-by: Stefano Garzarella <[email protected]>
drivers/vhost/vhost.c | 11 +++++++----
1 file changed, 7 insertions(+), 4 deletions(-)
diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c
index 2f2c45d20883..db329a6f6145 100644
--- a/drivers/vhost/vhost.c
+++ b/drivers/vhost/vhost.c
@@ -1522,6 +1522,7 @@ static void vhost_dev_unlock_vqs(struct vhost_dev *d)
static inline int vhost_get_avail_idx(struct vhost_virtqueue *vq)
{
__virtio16 idx;
+ u16 avail_idx;
int r;
r = vhost_get_avail(vq, idx, &vq->avail->idx);
@@ -1532,17 +1533,19 @@ static inline int vhost_get_avail_idx(struct vhost_virtqueue *vq)
}
/* Check it isn't doing very strange thing with available indexes */
- vq->avail_idx = vhost16_to_cpu(vq, idx);
- if (unlikely((u16)(vq->avail_idx - vq->last_avail_idx) > vq->num)) {
+ avail_idx = vhost16_to_cpu(vq, idx);
+ if (unlikely((u16)(avail_idx - vq->last_avail_idx) > vq->num)) {
vq_err(vq, "Invalid available index change from %u to %u",
- vq->last_avail_idx, vq->avail_idx);
+ vq->last_avail_idx, avail_idx);
return -EINVAL;
}
/* We're done if there is nothing new */
- if (vq->avail_idx == vq->last_avail_idx)
+ if (avail_idx == vq->avail_idx)
return 0;
+ vq->avail_idx = avail_idx;
+
/*
* We updated vq->avail_idx so we need a memory barrier between
* the index read above and the caller reading avail ring entries.
--
MST
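
For anyone following the thread, here is a minimal userspace sketch of
the return-value change, in case it helps to see the two behaviours side
by side. It is only a model under stated assumptions, not the kernel
code: struct model_vq, ring_idx and the two get_avail_idx_*() helpers
are hypothetical names, and the sketch leaves out the userspace access,
the endian conversion, the sanity check against vq->num and the memory
barrier.

```c
#include <stdint.h>

/* Hypothetical, simplified stand-in for struct vhost_virtqueue. */
struct model_vq {
	uint16_t ring_idx;       /* index published by the guest (avail->idx) */
	uint16_t avail_idx;      /* host's cached copy of the guest index */
	uint16_t last_avail_idx; /* next entry the host will consume */
};

/*
 * Models the pre-patch comparison: returns 1 whenever unconsumed
 * entries exist, even if the guest has published nothing new.
 */
int get_avail_idx_old(struct model_vq *vq)
{
	vq->avail_idx = vq->ring_idx;
	if (vq->avail_idx == vq->last_avail_idx)
		return 0;
	return 1;
}

/*
 * Models the post-patch comparison: returns 1 only if the guest index
 * moved since the host last cached it.
 */
int get_avail_idx_new(struct model_vq *vq)
{
	uint16_t avail_idx = vq->ring_idx;

	if (avail_idx == vq->avail_idx)
		return 0;
	vq->avail_idx = avail_idx;
	return 1;
}
```

With ring_idx = 2, avail_idx = 2 and last_avail_idx = 0 (two entries
already seen by the host but left in the ring because they are too small
for the incoming packet), the old helper returns 1 and the new one
returns 0. That is the mergeable-buffers case from the commit message:
a return of 1 is what makes vhost disable notifications and retry
immediately.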