On Mon, Mar 15, 2021 at 07:41:32PM +0300, Vladimir Sementsov-Ogievskiy wrote:
> 15.03.2021 09:06, Roman Kagan wrote:
> > Document (via a comment and an assert) that
> > nbd_client_detach_aio_context and nbd_client_attach_aio_context_bh run
> > in the desired aio_context
> >
> > Signed-off-by: Roman Kagan <rvka...@yandex-team.ru>
> > ---
> >  block/nbd.c | 12 ++++++++++++
> >  1 file changed, 12 insertions(+)
> >
> > diff --git a/block/nbd.c b/block/nbd.c
> > index 1d8edb5b21..658b827d24 100644
> > --- a/block/nbd.c
> > +++ b/block/nbd.c
> > @@ -241,6 +241,12 @@ static void nbd_client_detach_aio_context(BlockDriverState *bs)
> >  {
> >      BDRVNBDState *s = (BDRVNBDState *)bs->opaque;
> > +    /*
> > +     * This runs in the (old, about to be detached) aio context of the @bs so
> > +     * accessing the stuff on @s is concurrency-free.
> > +     */
> > +    assert(qemu_get_current_aio_context() == bdrv_get_aio_context(bs));
>
> Hmm.  I don't think so.  The handler is called from
> bdrv_set_aio_context_ignore(), which has the assertion
> g_assert(qemu_get_current_aio_context() == qemu_get_aio_context());.  There is
> also a comment above bdrv_set_aio_context_ignore(): "The caller must own the
> AioContext lock for the old AioContext of bs".
>
> So, we are not in the home context of bs here.  We are in the main aio
> context, and we hold the AioContext lock of the old bs context.

You're absolutely right.  I'm wondering where I got the idea of this
assertion from...
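
Just to make sure I now have the picture straight, here is the call path
as I understand it.  This is a simplified sketch from memory, not
verbatim block.c, and the comments are my own reading of it:

/* Simplified sketch, not verbatim block.c */
void bdrv_set_aio_context_ignore(BlockDriverState *bs,
                                 AioContext *new_context, GSList **ignore)
{
    /* Runs in the main loop thread; the caller holds the OLD context's lock. */
    g_assert(qemu_get_current_aio_context() == qemu_get_aio_context());

    /* ... drain, recurse into children and parents ... */

    bdrv_detach_aio_context(bs);
    /*
     * -> bs->drv->bdrv_detach_aio_context(bs), i.e.
     *    nbd_client_detach_aio_context(bs).
     *
     * Here bdrv_get_aio_context(bs) is still the old iothread context,
     * while qemu_get_current_aio_context() is the main context, so the
     * assert I added fires as soon as bs actually lives in an iothread.
     */

    bdrv_attach_aio_context(bs, new_context);
    /* -> sets bs->aio_context and calls bs->drv->bdrv_attach_aio_context() */
}
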
> > +
> >      /* Timer is deleted in nbd_client_co_drain_begin() */
> >      assert(!s->reconnect_delay_timer);
> >      /*
> > @@ -258,6 +264,12 @@ static void nbd_client_attach_aio_context_bh(void *opaque)
> >      BlockDriverState *bs = opaque;
> >      BDRVNBDState *s = (BDRVNBDState *)bs->opaque;
> > +    /*
> > +     * This runs in the (new, just attached) aio context of the @bs so
> > +     * accessing the stuff on @s is concurrency-free.
> > +     */
> > +    assert(qemu_get_current_aio_context() == bdrv_get_aio_context(bs));
>
> This is correct just because we are in a BH, scheduled for this
> context (I hope we can't reattach some third context prior to entering
> the BH in the second :).  So, I don't think this assertion really adds
> anything.

Indeed.

> > +
> >      if (s->connection_co) {
> >          /*
> >           * The node is still drained, so we know the coroutine has yielded in
> >
>
> I'm not sure that the asserted fact gives us "concurrency-free".  For
> this we also need to ensure that all other things in the driver are
> always called in the same aio context.  Still, it's a general assumption
> we have when writing block drivers: "everything in one aio context, so
> don't care".  Sometimes it leads to bugs, as some things are still
> called even from non-coroutine context.  Also, Paolo said
> (https://lists.gnu.org/archive/html/qemu-devel/2020-08/msg03814.html)
> that many iothreads will send requests to bs, and the code in the driver
> should be thread-safe.  I don't have a good understanding of all these
> things, and what I have is:
>
> For now (at least we don't have problems in the RHEL-based downstream)
> it may be OK to think that in a block driver everything is protected by
> the AioContext lock, and all the concurrency we have inside a block
> driver is switching between coroutines but never real parallelism.  But
> in general it's not so, and with multiqueue it's not so.  So, I'd not
> put such a comment :)

So the patch is bogus in every respect; let's just drop it.  I hope it
doesn't completely invalidate the rest of the series, though.

Meanwhile I certainly need to update my idea of the concurrency
assumptions in the block layer.

Thanks,
Roman.
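
P.S.  For my own notes, why the attach-side assert is tautological, as I
understand it.  Again a simplified sketch from memory, not the literal
block/nbd.c (QIOChannel handling and the rest elided):

static void nbd_client_attach_aio_context(BlockDriverState *bs,
                                          AioContext *new_context)
{
    /*
     * Called from bdrv_attach_aio_context(), which, as far as I can
     * tell, has already set bs->aio_context = new_context.
     */

    /* ... (re)attach the QIOChannel to new_context, etc. ... */

    bdrv_inc_in_flight(bs);

    /* The BH is explicitly scheduled into new_context and waited for. */
    aio_wait_bh_oneshot(new_context, nbd_client_attach_aio_context_bh, bs);
}

So by construction the BH runs in new_context, which at that point is
exactly what bdrv_get_aio_context(bs) returns; the assert could only
catch the "third context sneaks in before the BH runs" case you mention.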