On Mon, Mar 15, 2021 at 07:41:32PM +0300, Vladimir Sementsov-Ogievskiy wrote:
> 15.03.2021 09:06, Roman Kagan wrote:
> > Document (via a comment and an assert) that
> > nbd_client_detach_aio_context and nbd_client_attach_aio_context_bh run
> > in the desired aio_context
> >
> > Signed-off-by: Roman Kagan <rvka...@yandex-team.ru>
> > ---
> >  block/nbd.c | 12 ++++++++++++
> >  1 file changed, 12 insertions(+)
> >
> > diff --git a/block/nbd.c b/block/nbd.c
> > index 1d8edb5b21..658b827d24 100644
> > --- a/block/nbd.c
> > +++ b/block/nbd.c
> > @@ -241,6 +241,12 @@ static void nbd_client_detach_aio_context(BlockDriverState *bs)
> >  {
> >      BDRVNBDState *s = (BDRVNBDState *)bs->opaque;
> > +    /*
> > +     * This runs in the (old, about to be detached) aio context of the @bs so
> > +     * accessing the stuff on @s is concurrency-free.
> > +     */
> > +    assert(qemu_get_current_aio_context() == bdrv_get_aio_context(bs));
>
> Hmm.  I don't think so.  The handler is called from
> bdrv_set_aio_context_ignore(), which has the assertion
> g_assert(qemu_get_current_aio_context() == qemu_get_aio_context());.  There is
> also a comment above bdrv_set_aio_context_ignore(): "The caller must own the
> AioContext lock for the old AioContext of bs".
>
> So, we are not in the home context of bs here.  We are in the main aio
> context, and we hold the AioContext lock of the old bs context.

You're absolutely right.  I'm wondering where I got the idea of this
assertion from...
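
Just to make sure I now have the picture straight, here is the call path
as I understand it.  This is a simplified sketch from memory, not
verbatim block.c, and the comments are my own reading of it:

/* Simplified sketch, not verbatim block.c */
void bdrv_set_aio_context_ignore(BlockDriverState *bs,
                                 AioContext *new_context, GSList **ignore)
{
    /* Runs in the main loop thread; the caller holds the OLD context's lock. */
    g_assert(qemu_get_current_aio_context() == qemu_get_aio_context());

    /* ... drain, recurse into children and parents ... */

    bdrv_detach_aio_context(bs);
    /*
     * -> bs->drv->bdrv_detach_aio_context(bs), i.e.
     *    nbd_client_detach_aio_context(bs).
     *
     * Here bdrv_get_aio_context(bs) is still the old iothread context,
     * while qemu_get_current_aio_context() is the main context, so the
     * assert I added fires as soon as bs actually lives in an iothread.
     */

    bdrv_attach_aio_context(bs, new_context);
    /* -> sets bs->aio_context and calls bs->drv->bdrv_attach_aio_context() */
}
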
> > +
> >      /* Timer is deleted in nbd_client_co_drain_begin() */
> >      assert(!s->reconnect_delay_timer);
> >      /*
> > @@ -258,6 +264,12 @@ static void nbd_client_attach_aio_context_bh(void *opaque)
> >      BlockDriverState *bs = opaque;
> >      BDRVNBDState *s = (BDRVNBDState *)bs->opaque;
> > +    /*
> > +     * This runs in the (new, just attached) aio context of the @bs so
> > +     * accessing the stuff on @s is concurrency-free.
> > +     */
> > +    assert(qemu_get_current_aio_context() == bdrv_get_aio_context(bs));
>
> This is correct just because we are in a BH, scheduled for this
> context (I hope we can't reattach some third context prior to entering
> the BH in the second :).  So, I don't think this assertion really adds
> anything.

Indeed.

> > +
> >      if (s->connection_co) {
> >          /*
> >           * The node is still drained, so we know the coroutine has yielded in
> >
>
> I'm not sure that the asserted fact gives us "concurrency-free".  For
> this we also need to ensure that all other things in the driver are
> always called in the same aio context.  Still, it's a general assumption
> we have when writing block drivers: "everything in one aio context, so
> don't care".  Sometimes it leads to bugs, as some things are still
> called even from non-coroutine context.  Also, Paolo said
> (https://lists.gnu.org/archive/html/qemu-devel/2020-08/msg03814.html)
> that many iothreads will send requests to bs, and the code in the driver
> should be thread-safe.  I don't have a good understanding of all these
> things, and what I have is:
>
> For now (at least we don't have problems in the RHEL-based downstream)
> it may be OK to think that in a block driver everything is protected by
> the AioContext lock, and all the concurrency we have inside a block
> driver is switching between coroutines but never real parallelism.  But
> in general it's not so, and with multiqueue it's not so.  So, I'd not
> put such a comment :)

So the patch is bogus in every respect; let's just drop it.  I hope it
doesn't completely invalidate the rest of the series, though.

Meanwhile I certainly need to update my idea of the concurrency
assumptions in the block layer.

Thanks,
Roman.
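
P.S.  For my own notes, why the attach-side assert is tautological, as I
understand it.  Again a simplified sketch from memory, not the literal
block/nbd.c (QIOChannel handling and the rest elided):

static void nbd_client_attach_aio_context(BlockDriverState *bs,
                                          AioContext *new_context)
{
    /*
     * Called from bdrv_attach_aio_context(), which, as far as I can
     * tell, has already set bs->aio_context = new_context.
     */

    /* ... (re)attach the QIOChannel to new_context, etc. ... */

    bdrv_inc_in_flight(bs);

    /* The BH is explicitly scheduled into new_context and waited for. */
    aio_wait_bh_oneshot(new_context, nbd_client_attach_aio_context_bh, bs);
}

So by construction the BH runs in new_context, which at that point is
exactly what bdrv_get_aio_context(bs) returns; the assert could only
catch the "third context sneaks in before the BH runs" case you mention.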