async: aio_co_enter(): do aio_co_schedule in general case

Roman Kagan Thu, 08 Apr 2021 08:55:38 -0700

On Thu, Apr 08, 2021 at 05:08:20PM +0300, Vladimir Sementsov-Ogievskiy wrote:
> With the following patch we want to call aio_co_wake() from thread.
> And it works bad.
> Assume we have no iothreads.
> Assume we have a coroutine A, which waits in the yield point for external
> aio_co_wake(), and no progress can be done until it happen.
> Main thread is in blocking aio_poll() (for example, in blk_read()).
> 
> Now, in a separate thread we do aio_co_wake(). It calls  aio_co_enter(),
> which goes through last "else" branch and do aio_context_acquire(ctx).
> 
> Now we have a deadlock, as aio_poll() will not release the context lock
> until some progress is done, and progress can't be done until
> aio_co_wake() wake the coroutine A. And it can't because it wait for
> aio_context_acquire().
> 
> Still, aio_co_schedule() works well in parallel with blocking
> aio_poll(). So let's use it in generic case and drop
> aio_context_acquire/aio_context_release branch from aio_co_enter().
> 
> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsement...@virtuozzo.com>
> ---
>  util/async.c | 11 ++---------
>  1 file changed, 2 insertions(+), 9 deletions(-)
> 
> diff --git a/util/async.c b/util/async.c
> index 674dbefb7c..f05b883a39 100644
> --- a/util/async.c
> +++ b/util/async.c
> @@ -614,19 +614,12 @@ void aio_co_wake(struct Coroutine *co)
>  
>  void aio_co_enter(AioContext *ctx, struct Coroutine *co)
>  {
> -    if (ctx != qemu_get_current_aio_context()) {
> -        aio_co_schedule(ctx, co);
> -        return;
> -    }
> -
> -    if (qemu_in_coroutine()) {
> +    if (ctx == qemu_get_current_aio_context() && qemu_in_coroutine()) {
>          Coroutine *self = qemu_coroutine_self();
>          assert(self != co);
>          QSIMPLEQ_INSERT_TAIL(&self->co_queue_wakeup, co, co_queue_next);
>      } else {
> -        aio_context_acquire(ctx);
> -        qemu_aio_coroutine_enter(ctx, co);
> -        aio_context_release(ctx);
> +        aio_co_schedule(ctx, co);
>      }
>  }


I'm fine with the change, but I find the log message to be a bit
confusing (although correct).  AFAICS the problem is that calling
aio_co_enter from a thread which has no associated aio_context works
differently compared to calling it from a proper iothread: if the target
context was qemu_aio_context, an iothread would just schedule the
coroutine there, while a "dumb" thread would try lock the context
potentially resulting in a deadlock.  This patch makes "dumb" threads
and iothreads behave identically when entering a coroutine on a foreign
context.

You may want to rephrase the log message to that end.

Anyway

Reviewed-by: Roman Kagan <rvka...@yandex-team.ru>

Re: [PATCH v2 03/10] util/async: aio_co_enter(): do aio_co_schedule in general case

Reply via email to