Am 25.05.2023 um 20:51 hat Stefan Hajnoczi geschrieben: > On Thu, May 25, 2023 at 02:47:07PM +0200, Kevin Wolf wrote: > > qcow2_open() doesn't work correctly when opening the 'file' child moves > > bs to an iothread, for several reasons: > > > > - It uses BDRV_POLL_WHILE() to wait for the qcow2_open_entry() > > coroutine, which involves dropping the AioContext lock for bs when it > > is not in the main context - but we don't hold it, so this crashes. > > > > - It runs the qcow2_open_entry() coroutine in the current thread instead > > of the new AioContext of bs. > > > > - qcow2_open_entry() doesn't notify the main loop when it's done. > > > > This patches fixes these issues around delegating work to a coroutine. > > Temporarily dropping the main AioContext lock is not necessary because > > we know we run in the main thread. > > > > Signed-off-by: Kevin Wolf <kw...@redhat.com> > > --- > > block/qcow2.c | 8 ++++++-- > > 1 file changed, 6 insertions(+), 2 deletions(-) > > > > diff --git a/block/qcow2.c b/block/qcow2.c > > index b00b4e7575..7f3948360d 100644 > > --- a/block/qcow2.c > > +++ b/block/qcow2.c > > @@ -1904,6 +1904,8 @@ static void coroutine_fn qcow2_open_entry(void > > *opaque) > > qoc->ret = qcow2_do_open(qoc->bs, qoc->options, qoc->flags, true, > > qoc->errp); > > qemu_co_mutex_unlock(&s->lock); > > + > > + aio_wait_kick(); > > } > > > > static int qcow2_open(BlockDriverState *bs, QDict *options, int flags, > > @@ -1929,8 +1931,10 @@ static int qcow2_open(BlockDriverState *bs, QDict > > *options, int flags, > > > > assert(!qemu_in_coroutine()); > > assert(qemu_get_current_aio_context() == qemu_get_aio_context()); > > - qemu_coroutine_enter(qemu_coroutine_create(qcow2_open_entry, &qoc)); > > - BDRV_POLL_WHILE(bs, qoc.ret == -EINPROGRESS); > > + > > + aio_co_enter(bdrv_get_aio_context(bs), > > + qemu_coroutine_create(qcow2_open_entry, &qoc)); > > + AIO_WAIT_WHILE_UNLOCKED(NULL, qoc.ret == -EINPROGRESS); > > Want to update the doc comment for bdrv_open_file_child() with a warning > that @parent's AioContext can change?
Ok, I'll squash in the following. I seem to remember that bdrv_open_child() is actually wrong, too, regarding AioContext locking, but I didn't need to fix it for this test case. We need more test cases to break everything. :-) And the earlier we can get rid of the AioContext lock the better, because it seems really hard to get right across the board. Kevin diff --git a/block.c b/block.c index a2f8d5a0c0..263e1e22f3 100644 --- a/block.c +++ b/block.c @@ -3644,6 +3644,9 @@ done: * BlockdevRef. * * The BlockdevRef will be removed from the options QDict. + * + * @parent can move to a different AioContext in this functions. Callers must + * make sure that their AioContext locking is still correct after this. */ BdrvChild *bdrv_open_child(const char *filename, QDict *options, const char *bdref_key, @@ -3668,6 +3671,9 @@ BdrvChild *bdrv_open_child(const char *filename, /* * Wrapper on bdrv_open_child() for most popular case: open primary child of bs. + * + * @parent can move to a different AioContext in this functions. Callers must + * make sure that their AioContext locking is still correct after this. */ int bdrv_open_file_child(const char *filename, QDict *options, const char *bdref_key,
signature.asc
Description: PGP signature