-----Original Message-----
From: Daniel P. Berrangé <berra...@redhat.com>
Sent: Wednesday, December 1, 2021 5:11 PM
To: Rao, Lei <lei....@intel.com>
Cc: Zhang, Chen <chen.zh...@intel.com>; ebl...@redhat.com; vsement...@virtuozzo.com; kw...@redhat.com; hre...@redhat.com; qemu-bl...@nongnu.org; qemu-devel@nongnu.org
Subject: Re: [PATCH] QIO: Add force_quit to the QIOChannel to ensure QIO exits cleanly in some corner case
On Wed, Dec 01, 2021 at 03:54:27PM +0800, Rao, Lei wrote:
> We found that the QIO channel coroutine could not be awakened in some
> corner cases during our stress test for COLO.
> The hang that this patch fixes looks as follows:
> #0  0x00007fad72e24bf6 in __ppoll (fds=0x5563d75861f0, nfds=1,
>     timeout=<optimized out>, sigmask=0x0) at ../sysdeps/unix/sysv/linux/ppoll.c:44
> #1  0x00005563d6977c68 in qemu_poll_ns (fds=0x5563d75861f0, nfds=1,
>     timeout=599999697630) at util/qemu-timer.c:348
> #2  0x00005563d697ac44 in aio_poll (ctx=0x5563d755dfa0, blocking=true)
>     at util/aio-posix.c:669
> #3  0x00005563d68ba24f in bdrv_do_drained_begin (bs=0x5563d75ea0c0, recursive=false,
>     parent=0x0, ignore_bds_parents=false, poll=true) at block/io.c:432
> #4  0x00005563d68ba338 in bdrv_drained_begin (bs=0x5563d75ea0c0) at block/io.c:438
> #5  0x00005563d689c6d2 in quorum_del_child (bs=0x5563d75ea0c0, child=0x5563d7908600,
>     errp=0x7fff3cf5b960) at block/quorum.c:1063
> #6  0x00005563d684328f in bdrv_del_child (parent_bs=0x5563d75ea0c0, child=0x5563d7908600,
>     errp=0x7fff3cf5b960) at block.c:6568
> #7  0x00005563d658499e in qmp_x_blockdev_change (parent=0x5563d80ec4c0 "colo-disk0",
>     has_child=true, child=0x5563d7577d30 "children.1", has_node=false, node=0x0,
>     errp=0x7fff3cf5b960) at blockdev.c:4494
> #8  0x00005563d67e8b4e in qmp_marshal_x_blockdev_change (args=0x7fad6400ada0,
>     ret=0x7fff3cf5b9f8, errp=0x7fff3cf5b9f0) at qapi/qapi-commands-block-core.c:1538
> #9  0x00005563d691cd9e in do_qmp_dispatch (cmds=0x5563d719a4f0 <qmp_commands>,
>     request=0x7fad64009d80, allow_oob=true, errp=0x7fff3cf5ba98) at qapi/qmp-dispatch.c:132
> #10 0x00005563d691cfab in qmp_dispatch (cmds=0x5563d719a4f0 <qmp_commands>,
>     request=0x7fad64009d80, allow_oob=true) at qapi/qmp-dispatch.c:175
> #11 0x00005563d67b7787 in monitor_qmp_dispatch (mon=0x5563d7605d40, req=0x7fad64009d80)
>     at monitor/qmp.c:145
> #12 0x00005563d67b7b24 in monitor_qmp_bh_dispatcher (data=0x0) at monitor/qmp.c:234
> #13 0x00005563d69754c2 in aio_bh_call (bh=0x5563d7473230) at util/async.c:89
> #14 0x00005563d697555e in aio_bh_poll (ctx=0x5563d7471da0) at util/async.c:117
> #15 0x00005563d697a423 in aio_dispatch (ctx=0x5563d7471da0) at util/aio-posix.c:459
> #16 0x00005563d6975917 in aio_ctx_dispatch (source=0x5563d7471da0, callback=0x0,
>     user_data=0x0) at util/async.c:260
> #17 0x00007fad730e1fbd in g_main_context_dispatch () from /lib/x86_64-linux-gnu/libglib-2.0.so.0
> #18 0x00005563d6978cda in glib_pollfds_poll () at util/main-loop.c:219
> #19 0x00005563d6978d58 in os_host_main_loop_wait (timeout=977650) at util/main-loop.c:242
> #20 0x00005563d6978e69 in main_loop_wait (nonblocking=0) at util/main-loop.c:518
> #21 0x00005563d658f5ed in main_loop () at vl.c:1814
> #22 0x00005563d6596bb7 in main (argc=61, argv=0x7fff3cf5c0c8, envp=0x7fff3cf5c2b8) at vl.c:450
>
> From the call trace, we can see that the QEMU main thread is waiting for
> in_flight to drop to zero, but in_flight never reaches 0. After analysis,
> the root cause is that the NBD coroutine was never awakened. Although this
> bug was found in the COLO test, I think it is a general problem in the QIO
> module, and it also affects other modules that depend on QIO, such as NBD.
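
To spell out why the main thread hangs: frames #2/#3 above are essentially
doing the following, so if even one NBD request coroutine never terminates,
the drain polls forever. This is only a simplified sketch of the drain wait
(the real code goes through bdrv_drain_poll()/BDRV_POLL_WHILE()), not the
exact QEMU implementation:

    /* Simplified sketch, not the real bdrv_do_drained_begin(): drain blocks
     * until every in-flight request on this node has completed. */
    static void drain_wait_sketch(BlockDriverState *bs)
    {
        AioContext *ctx = bdrv_get_aio_context(bs);

        /* If a stuck NBD request coroutine never does in_flight--, this
         * loop never exits and the main thread hangs inside aio_poll(). */
        while (qatomic_read(&bs->in_flight) > 0) {
            aio_poll(ctx, true /* blocking */);
        }
    }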
> We dump the following data:
> $2 = {
>   in_flight = 2,
>   state = NBD_CLIENT_QUIT,
>   connect_status = 0,
>   connect_err = 0x0,
>   wait_in_flight = false,
>   requests = {{
>       coroutine = 0x0,
>       offset = 273077071872,
>       receiving = false,
>     }, {
>       coroutine = 0x7f1164571df0,
>       offset = 498792529920,
>       receiving = false,
>     }, {
>       coroutine = 0x0,
>       offset = 500663590912,
>       receiving = false,
>     }, {
>       ...
>     } },
>   reply = {
>     simple = {
>       magic = 1732535960,
>       error = 0,
>       handle = 0
>     },
>     structured = {
>       magic = 1732535960,
>       flags = 0,
>       type = 0,
>       handle = 0,
>       length = 0
>     },
>     {
>       magic = 1732535960,
>       _skip = 0,
>       handle = 0
>     }
>   },
>   bs = 0x7f11640e2140,
>   reconnect_delay = 0,
>   saddr = 0x7f11640e1f80,
>   export = 0x7f11640dc560 "parent0",
> }
>
> From the data, we can see that the NBD coroutine does not exit normally
> when the NBD server is killed (we kill the Secondary VM in the COLO stress
> test). It therefore never executes in_flight--, so the QEMU main thread
> hangs here. On further analysis, I found that the NBD coroutine yields in
> nbd_send_request() or qio_channel_write_all() inside nbd_co_send_request();
> the yield happens because the kernel returns EAGAIN (under the stress test).
> However, NBD does not know that it yielded there, so nbd_recv_coroutines_wake
> will not wake it up, because req->receiving is false. If the NBD server is
> terminated abnormally at the same time, the G_IO_OUT event becomes invalid
> and the coroutine will never be awakened. In addition, s->ioc is released
> immediately, so forcibly waking up the NBD coroutine makes it access
> ioc->xxx and causes a segmentation fault. Therefore I add a flag named
> force_quit to the QIOChannel to ensure that QIO exits immediately, and I
> add the function qio_channel_coroutines_wake to wake the coroutines up.
>
> Signed-off-by: Lei Rao <lei....@intel.com>
> Signed-off-by: Zhang Chen <chen.zh...@intel.com>
> ---
>  block/nbd.c          |  5 +++++
>  include/io/channel.h | 19 +++++++++++++++++++
>  io/channel.c         | 22 ++++++++++++++++++++++
>  3 files changed, 46 insertions(+)
>
> diff --git a/block/nbd.c b/block/nbd.c
> index 5ef462db1b..5ee4eaaf57 100644
> --- a/block/nbd.c
> +++ b/block/nbd.c
> @@ -208,6 +208,8 @@ static void nbd_teardown_connection(BlockDriverState *bs)
>      assert(!s->in_flight);
>  
>      if (s->ioc) {
> +        qio_channel_set_force_quit(s->ioc, true);
> +        qio_channel_coroutines_wake(s->ioc);
>          qio_channel_shutdown(s->ioc, QIO_CHANNEL_SHUTDOWN_BOTH, NULL);

Calling shutdown here should already be causing qio_channel_readv_all to
wake up and break out of its poll() loop. We shouldn't need to add a second
way to break out of the poll().

Calling shutdown can wake up the coroutine that is polling, but I think it is
not enough. When I tried to forcibly wake up the NBD coroutine, it could cause
a segmentation fault. The root cause is that the coroutine continues to access
ioc->xxx in qio_channel_yield() after being woken, but the ioc may already
have been released and set to NULL, for example in
nbd_co_do_establish_connection(). I think calling shutdown will have the same
result. So I add force_quit: once it is set to true, the coroutine exits
immediately without touching the ioc.
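
To make the req->receiving problem concrete: the NBD wake helper is roughly
the following (paraphrased from memory from block/nbd.c, not an exact copy),
so a request that yielded on the send side in qio_channel_write_all() still
has receiving == false and is simply skipped:

    /* Rough paraphrase of the existing wake logic in block/nbd.c. Only
     * requests parked waiting for a reply (receiving == true) are woken;
     * a coroutine that yielded inside the send path is ignored. */
    static void nbd_recv_coroutines_wake_sketch(BDRVNBDState *s)
    {
        for (int i = 0; i < MAX_NBD_REQUESTS; i++) {
            NBDClientRequest *req = &s->requests[i];

            if (req->coroutine && req->receiving) {
                aio_co_wake(req->coroutine);
            }
        }
    }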
Thanks,
Lei

>          yank_unregister_function(BLOCKDEV_YANK_INSTANCE(s->bs->node_name),
>                                   nbd_yank, s->bs);
> @@ -315,6 +317,7 @@ int coroutine_fn nbd_co_do_establish_connection(BlockDriverState *bs,
>  
>      /* successfully connected */
>      s->state = NBD_CLIENT_CONNECTED;
> +    qio_channel_set_force_quit(s->ioc, false);
>      qemu_co_queue_restart_all(&s->free_sema);
>  
>      return 0;
> @@ -345,6 +348,8 @@ static coroutine_fn void nbd_reconnect_attempt(BDRVNBDState *s)
>  
>      /* Finalize previous connection if any */
>      if (s->ioc) {
> +        qio_channel_set_force_quit(s->ioc, true);
> +        qio_channel_coroutines_wake(s->ioc);

Isn't this code path just missing a qio_channel_shutdown call or a
qio_channel_close call to make the socket trigger wakeup from poll?

I don't think adding that would solve the bug either; the reason is the same
as above.

Thanks,
Lei

>          qio_channel_detach_aio_context(QIO_CHANNEL(s->ioc));
>          yank_unregister_function(BLOCKDEV_YANK_INSTANCE(s->bs->node_name),
>                                   nbd_yank, s->bs);
> diff --git a/include/io/channel.h b/include/io/channel.h
> index 88988979f8..bc5af8cdd6 100644
> --- a/include/io/channel.h
> +++ b/include/io/channel.h
> @@ -78,6 +78,7 @@ struct QIOChannel {
>      AioContext *ctx;
>      Coroutine *read_coroutine;
>      Coroutine *write_coroutine;
> +    bool force_quit;
>  #ifdef _WIN32
>      HANDLE event; /* For use with GSource on Win32 */
>  #endif
> @@ -484,6 +485,24 @@ int qio_channel_set_blocking(QIOChannel *ioc,
>                               bool enabled,
>                               Error **errp);
>  
> +/**
> + * qio_channel_set_force_quit:
> + * @ioc: the channel object
> + * @quit: the new flag state
> + *
> + * Set the new flag state
> + */
> +void qio_channel_set_force_quit(QIOChannel *ioc, bool quit);
> +
> +/**
> + * qio_channel_coroutines_wake:
> + * @ioc: the channel object
> + *
> + * Wake up the coroutines to ensure that they will exit normally
> + * when the server terminates abnormally
> + */
> +void qio_channel_coroutines_wake(QIOChannel *ioc);
> +
>  /**
>   * qio_channel_close:
>   * @ioc: the channel object
> diff --git a/io/channel.c b/io/channel.c
> index e8b019dc36..bc1a9e055c 100644
> --- a/io/channel.c
> +++ b/io/channel.c
> @@ -136,6 +136,9 @@ int qio_channel_readv_full_all_eof(QIOChannel *ioc,
>          if (len == QIO_CHANNEL_ERR_BLOCK) {
>              if (qemu_in_coroutine()) {
>                  qio_channel_yield(ioc, G_IO_IN);
> +                if (ioc->force_quit) {
> +                    goto cleanup;
> +                }
>              } else {
>                  qio_channel_wait(ioc, G_IO_IN);
>              }
> @@ -242,6 +245,9 @@ int qio_channel_writev_full_all(QIOChannel *ioc,
>          if (len == QIO_CHANNEL_ERR_BLOCK) {
>              if (qemu_in_coroutine()) {
>                  qio_channel_yield(ioc, G_IO_OUT);
> +                if (ioc->force_quit) {
> +                    goto cleanup;
> +                }
>              } else {
>                  qio_channel_wait(ioc, G_IO_OUT);
>              }
> @@ -543,6 +549,9 @@ void coroutine_fn qio_channel_yield(QIOChannel *ioc,
>      }
>      qio_channel_set_aio_fd_handlers(ioc);
>      qemu_coroutine_yield();
> +    if (ioc->force_quit) {
> +        return;
> +    }
>  
>      /* Allow interrupting the operation by reentering the coroutine other than
>       * through the aio_fd_handlers. */
> @@ -555,6 +564,19 @@ void coroutine_fn qio_channel_yield(QIOChannel *ioc,
>      }
>  }
>  
> +void qio_channel_set_force_quit(QIOChannel *ioc, bool quit)
> +{
> +    ioc->force_quit = quit;
> +}
> +
> +void qio_channel_coroutines_wake(QIOChannel *ioc)
> +{
> +    if (ioc->read_coroutine) {
> +        qio_channel_restart_read(ioc);
> +    } else if (ioc->write_coroutine) {
> +        qio_channel_restart_write(ioc);
> +    }
> +}

None of this should be needed AFAICT, as the poll/coroutine code should
already break out of poll if the socket is closed, or is marked shutdown.
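
For reference, here is the behaviour I'm relying on. A quick standalone test
program (my own sketch, nothing QEMU-specific) shows that once shutdown() has
been called on a socket, poll() reports it ready immediately instead of
blocking, which is what should kick anything waiting for G_IO_IN/G_IO_OUT out
of its wait:

    /* Standalone demo: poll() on a shut-down socket returns immediately. */
    #include <poll.h>
    #include <stdio.h>
    #include <sys/socket.h>
    #include <unistd.h>

    int main(void)
    {
        int sv[2];

        if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0) {
            perror("socketpair");
            return 1;
        }

        /* Shut down one end, as qio_channel_shutdown() does for a socket. */
        shutdown(sv[0], SHUT_RDWR);

        struct pollfd pfd = { .fd = sv[0], .events = POLLIN | POLLOUT };
        int n = poll(&pfd, 1, 5000);   /* returns at once, not after 5s */

        printf("poll() = %d, revents = 0x%x\n", n, pfd.revents);

        close(sv[0]);
        close(sv[1]);
        return 0;
    }

On Linux I'd expect this to return 1 immediately, with POLLHUP (and typically
POLLIN/POLLOUT) set in revents rather than waiting for the timeout.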
Regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|