On Mon, 05/08 14:07, Stefan Hajnoczi wrote:
> The main loop uses aio_disable_external()/aio_enable_external() to
> temporarily disable processing of external AioContext clients like
> device emulation.
>
> This allows monitor commands to quiesce I/O and prevent the guest from
> submitting new requests while a monitor command is in progress.
>
> The aio_enable_external() API is currently broken when an IOThread is in
> aio_poll() waiting for fd activity when the main loop re-enables
> external clients.  Incrementing ctx->external_disable_cnt does not wake
> the IOThread from ppoll(2) so fd processing remains suspended and leads
> to unresponsive emulated devices.
>
> This patch adds an aio_notify() call to aio_enable_external() so the
> IOThread is kicked out of ppoll(2) and will re-arm the file descriptors.
>
> The bug can be reproduced as follows:
>
>   $ qemu -M accel=kvm -m 1024 \
>          -object iothread,id=iothread0 \
>          -device virtio-scsi-pci,iothread=iothread0,id=virtio-scsi-pci0 \
>          -drive if=none,id=drive0,aio=native,cache=none,format=raw,file=test.img \
>          -device scsi-hd,id=scsi-hd0,drive=drive0 \
>          -qmp tcp::5555,server,nowait
>
>   $ scripts/qmp/qmp-shell localhost:5555
>   (qemu) blockdev-snapshot-sync device=drive0 snapshot-file=sn1.qcow2
>          mode=absolute-paths format=qcow2
>
> After blockdev-snapshot-sync completes the SCSI disk will be
> unresponsive.  This leads to request timeouts inside the guest.
>
> Reported-by: Qianqian Zhu <qi...@redhat.com>
> Suggested-by: Fam Zheng <f...@redhat.com>
> Signed-off-by: Stefan Hajnoczi <stefa...@redhat.com>
> ---
> v3:
>  * s/dec_fetch/fetch_dec/ [Fam]
> ---
>  include/block/aio.h | 10 ++++++++--
>  1 file changed, 8 insertions(+), 2 deletions(-)
>
> diff --git a/include/block/aio.h b/include/block/aio.h
> index 406e323..e9aeeae 100644
> --- a/include/block/aio.h
> +++ b/include/block/aio.h
> @@ -454,8 +454,14 @@ static inline void aio_disable_external(AioContext *ctx)
>   */
>  static inline void aio_enable_external(AioContext *ctx)
>  {
> -    assert(ctx->external_disable_cnt > 0);
> -    atomic_dec(&ctx->external_disable_cnt);
> +    int old;
> +
> +    old = atomic_fetch_dec(&ctx->external_disable_cnt);
> +    assert(old > 0);
> +    if (old == 1) {
> +        /* Kick event loop so it re-arms file descriptors */
> +        aio_notify(ctx);
> +    }
>  }
>
>  /**
> --
> 2.9.3
>
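
For anyone reading along, here is a rough standalone sketch of the pattern in
the hunk above: capture the counter's previous value atomically and kick the
event loop only on the 1 -> 0 transition. It is not QEMU code. DemoContext,
demo_notify() and the other demo_* names are made up for illustration, and
C11 atomic_fetch_sub() is assumed here as a stand-in for QEMU's
atomic_fetch_dec().

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical stand-in for AioContext; not QEMU's real structure. */
typedef struct {
    atomic_int external_disable_cnt; /* > 0: external clients are disabled */
    int notify_count;                /* how many times the loop was kicked */
} DemoContext;

static void demo_init(DemoContext *ctx)
{
    atomic_init(&ctx->external_disable_cnt, 0);
    ctx->notify_count = 0;
}

/* Stand-in for aio_notify(): only records that a kick happened. */
static void demo_notify(DemoContext *ctx)
{
    ctx->notify_count++;
}

static void demo_disable_external(DemoContext *ctx)
{
    atomic_fetch_add(&ctx->external_disable_cnt, 1);
}

static void demo_enable_external(DemoContext *ctx)
{
    /* atomic_fetch_sub() returns the value held before the decrement,
     * so the check below sees exactly the transition this caller made. */
    int old = atomic_fetch_sub(&ctx->external_disable_cnt, 1);

    assert(old > 0);
    if (old == 1) {
        /* The last outstanding disable was dropped: kick the loop once. */
        demo_notify(ctx);
    }
}
```

With two nested disable calls, the first enable leaves the counter at 1 and
does not kick; only the second (outermost) enable calls demo_notify(). Doing
the decrement and the read as one atomic operation is what keeps two
concurrent enablers from both, or neither, observing the 1 -> 0 transition.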

The patchew failure doesn't seem to be related to this patch; at least I
cannot reproduce it.

The patch looks good to me now!

Reviewed-by: Fam Zheng <f...@redhat.com>