Hi Michael,

On Sun, 01 Feb 2026 01:46:19 +0800,
Michael S. Tsirkin wrote:
> 
> On Fri, Jan 30, 2026 at 02:52:12PM -0600, Ira Weiny wrote:
> > Li Chen wrote:
> > > Under heavy concurrent flush traffic, virtio-pmem can overflow its request
> > > virtqueue (req_vq): virtqueue_add_sgs() starts returning -ENOSPC and the
> > > driver logs "no free slots in the virtqueue". Shortly after that the
> > > device enters VIRTIO_CONFIG_S_NEEDS_RESET and flush requests fail with
> > > "virtio pmem device needs a reset".
> > > 
> > > Serialize virtio_pmem_flush() with a per-device mutex so only one flush
> > > request is in-flight at a time. This prevents req_vq descriptor overflow
> > > under high concurrency.
> > > 
> > > Reproducer (guest with virtio-pmem):
> > >   - mkfs.ext4 -F /dev/pmem0
> > >   - mount -t ext4 -o dax,noatime /dev/pmem0 /mnt/bench
> > >   - fio: ioengine=io_uring rw=randwrite bs=4k iodepth=64 numjobs=64
> > >         direct=1 fsync=1 runtime=30s time_based=1
> > 
> > I don't see this error.
> > 
> > <file>
> > 13:28:50 > cat foo.fio 
> > # test http://lore.kernel.org/[email protected]
> > 
> > [global]
> > filename=/mnt/bench/foo
> > ioengine=io_uring
> > size=1G
> > bs=4K
> > iodepth=64
> > numjobs=64
> > direct=1
> > fsync=1
> > runtime=30s
> > time_based=1
> > 
> > [rand-write]
> > rw=randwrite
> > </file>
> > 
> > It's possible I'm doing something wrong.  Can you share your qemu cmdline
> > or more details on the bug yall see.
> > 
> > >   - dmesg: "no free slots in the virtqueue"
> > >            "virtio pmem device needs a reset"
> > > 
> > > Fixes: 6e84200c0a29 ("virtio-pmem: Add virtio pmem driver")
> > > Signed-off-by: Li Chen <[email protected]>
> > > ---
> > >  drivers/nvdimm/nd_virtio.c   | 15 +++++++++++----
> > >  drivers/nvdimm/virtio_pmem.c |  1 +
> > >  drivers/nvdimm/virtio_pmem.h |  4 ++++
> > >  3 files changed, 16 insertions(+), 4 deletions(-)
> > > 
> > > diff --git a/drivers/nvdimm/nd_virtio.c b/drivers/nvdimm/nd_virtio.c
> > > index c3f07be4aa22..827a17fe7c71 100644
> > > --- a/drivers/nvdimm/nd_virtio.c
> > > +++ b/drivers/nvdimm/nd_virtio.c
> > > @@ -44,19 +44,24 @@ static int virtio_pmem_flush(struct nd_region *nd_region)
> > >   unsigned long flags;
> > >   int err, err1;
> > >  
> > > + might_sleep();
> 
> 
> for that matter might_sleep not really needed near mutex_lock.
> 
> 
> > > + mutex_lock(&vpmem->flush_lock);

Good point. mutex_lock() already does might_sleep(), so the explicit
might_sleep() next to the lock is redundant.

I'll drop it in v2 (which also switches to guard(mutex) as Ira suggested).
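
For reference, here is a rough userspace analogue of the scope-based
locking that guard(mutex) provides. It is only a sketch and not the v2
patch: fake_vpmem, fake_flush() and FLUSH_GUARD are made-up names, a
pthread mutex stands in for the kernel mutex, and the cleanup attribute
(GCC/clang) plays the role that the kernel's cleanup.h helpers play.

```c
#include <errno.h>
#include <pthread.h>

/* Hypothetical stand-in for the per-device state; only the lock matters here. */
struct fake_vpmem {
	pthread_mutex_t flush_lock;
	int in_flight;		/* flushes currently inside the critical section */
	int max_in_flight;	/* high-water mark, expected to stay at 1 */
};

/* Cleanup handler: runs automatically when the guard variable leaves scope. */
static void flush_unlock(pthread_mutex_t **lock)
{
	if (*lock)
		pthread_mutex_unlock(*lock);
}

/* Userspace imitation of guard(mutex): lock now, unlock on any scope exit. */
#define FLUSH_GUARD(m) \
	pthread_mutex_t *flush_guard_ \
		__attribute__((cleanup(flush_unlock), unused)) = \
		(pthread_mutex_lock(m), (m))

/* Made-up flush path: 'fail' simulates an error that returns early. */
static int fake_flush(struct fake_vpmem *vpmem, int fail)
{
	FLUSH_GUARD(&vpmem->flush_lock);

	vpmem->in_flight++;
	if (vpmem->in_flight > vpmem->max_in_flight)
		vpmem->max_in_flight = vpmem->in_flight;

	if (fail) {
		vpmem->in_flight--;
		return -EIO;	/* early return: the guard still unlocks */
	}

	vpmem->in_flight--;
	return 0;
}
```

The point of the pattern is that every return path releases the lock
without an explicit unlock or a goto label, which is what makes the
error handling in the flush path simpler.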

Regards,
Li
