On 03.06.24 at 18:21, Kevin Wolf wrote:
> On 03.06.2024 at 16:17, Fiona Ebner wrote:
>> On 26.03.24 at 13:44, Kevin Wolf wrote:
>>>
>>> The fix for bdrv_flush_all() is probably to make it bdrv_co_flush_all()
>>> with a coroutine wrapper so that the graph lock is held for the whole
>>> function. Then calling bdrv_co_flush() while iterating the list is safe
>>> and doesn't allow concurrent graph modifications.
>>
>> The second is that iotest 255 ran into an assertion failure upon QMP 'quit':
>>
>>> ../block/graph-lock.c:113: bdrv_graph_wrlock: Assertion
>>> `!qemu_in_coroutine()' failed.
>>
>> Looking at the backtrace:
>>
>>> #5  0x0000762a90cc3eb2 in __GI___assert_fail
>>>     (assertion=0x5afb07991e7d "!qemu_in_coroutine()", file=0x5afb07991e00
>>>     "../block/graph-lock.c", line=113, function=0x5afb07991f20
>>>     <__PRETTY_FUNCTION__.4> "bdrv_graph_wrlock")
>>>     at ./assert/assert.c:101
>>> #6  0x00005afb07585311 in bdrv_graph_wrlock () at ../block/graph-lock.c:113
>>> #7  0x00005afb07573a36 in blk_remove_bs (blk=0x5afb0af99420) at
>>>     ../block/block-backend.c:901
>>> #8  0x00005afb075729a7 in blk_delete (blk=0x5afb0af99420) at
>>>     ../block/block-backend.c:487
>>> #9  0x00005afb07572d88 in blk_unref (blk=0x5afb0af99420) at
>>>     ../block/block-backend.c:547
>>> #10 0x00005afb07572fe8 in bdrv_next (it=0x762a852fef00) at
>>>     ../block/block-backend.c:618
>>> #11 0x00005afb0758cd65 in bdrv_co_flush_all () at ../block/io.c:2347
>>> #12 0x00005afb0753ba37 in bdrv_co_flush_all_entry (opaque=0x7ffff12c6050)
>>>     at block/block-gen.c:1391
>>> #13 0x00005afb0773bf41 in coroutine_trampoline (i0=168365184, i1=23291)
>>
>> So I guess calling bdrv_next() is not safe from a coroutine, because
>> the function doing the iteration could end up being the last thing to
>> have a reference to the BB.
>
> Does your bdrv_co_flush_all() take the graph (reader) lock? If so, this
> is surprising, because while we hold the graph lock, no reference should
> be able to go away - you need the writer lock for that and you won't get
> it as long as bdrv_co_flush_all() locks the graph. So whatever had a
> reference before the bdrv_next() loop must still have it now. Do you
> know where it gets dropped?
>
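For context, the conversion in question declares bdrv_co_flush_all() as a
coroutine function and lets scripts/block-coroutine-wrapper.py generate the
bdrv_flush_all() wrapper that takes the graph reader lock around the whole
call. A rough sketch of the declarations (the exact annotation names and
placement are an assumption based on other generated-wrapper users, not the
actual patch being tested here):

    /* Sketch only; annotation names are assumptions, not the tested patch. */

    /* Coroutine version, meant to run entirely under the graph reader lock. */
    int coroutine_fn GRAPH_RDLOCK bdrv_co_flush_all(void);

    /*
     * Generated wrapper: the _bdrv_rdlock variant makes the generated
     * coroutine entry take bdrv_graph_co_rdlock() around the call, as seen
     * in the generated code quoted below.
     */
    int co_wrapper_mixed_bdrv_rdlock bdrv_flush_all(void);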
AFAICT, yes, it does hold the graph reader lock. The generated code is:

> static void coroutine_fn bdrv_co_flush_all_entry(void *opaque)
> {
>     BdrvFlushAll *s = opaque;
>
>     bdrv_graph_co_rdlock();
>     s->ret = bdrv_co_flush_all();
>     bdrv_graph_co_rdunlock();
>     s->poll_state.in_progress = false;
>
>     aio_wait_kick();
> }

Apparently, when the mirror job is aborted/exits, which can happen during the
polling for bdrv_co_flush_all_entry(), a reference can go away without the
write lock being held (at least my breakpoints didn't trigger):

> #0  blk_unref (blk=0x5cdefe943d20) at ../block/block-backend.c:537
> #1  0x00005cdefb26697e in mirror_exit_common (job=0x5cdefeb53000) at
>     ../block/mirror.c:710
> #2  0x00005cdefb263575 in mirror_abort (job=0x5cdefeb53000) at
>     ../block/mirror.c:823
> #3  0x00005cdefb2248a6 in job_abort (job=0x5cdefeb53000) at ../job.c:825
> #4  0x00005cdefb2245f2 in job_finalize_single_locked (job=0x5cdefeb53000) at
>     ../job.c:855
> #5  0x00005cdefb223852 in job_completed_txn_abort_locked (job=0x5cdefeb53000)
>     at ../job.c:958
> #6  0x00005cdefb223714 in job_completed_locked (job=0x5cdefeb53000) at
>     ../job.c:1065
> #7  0x00005cdefb224a8b in job_exit (opaque=0x5cdefeb53000) at ../job.c:1088
> #8  0x00005cdefb4134fc in aio_bh_call (bh=0x5cdefe7487c0) at
>     ../util/async.c:171
> #9  0x00005cdefb4136ce in aio_bh_poll (ctx=0x5cdefd9cd750) at
>     ../util/async.c:218
> #10 0x00005cdefb3efdfd in aio_poll (ctx=0x5cdefd9cd750, blocking=true) at
>     ../util/aio-posix.c:722
> #11 0x00005cdefb20435e in bdrv_poll_co (s=0x7ffe491621d8) at
>     ../block/block-gen.h:43
> #12 0x00005cdefb206a33 in bdrv_flush_all () at block/block-gen.c:1410
> #13 0x00005cdefae5c8ed in do_vm_stop (state=RUN_STATE_SHUTDOWN,
>     send_stop=false) at ../system/cpus.c:297
> #14 0x00005cdefae5c850 in vm_shutdown () at ../system/cpus.c:308
> #15 0x00005cdefae6d892 in qemu_cleanup (status=0) at ../system/runstate.c:871
> #16 0x00005cdefb1a7e78 in qemu_default_main () at ../system/main.c:38
> #17 0x00005cdefb1a7eb8 in main (argc=34, argv=0x7ffe491623a8) at
>     ../system/main.c:48

Looking at the code in mirror_exit_common(), it doesn't seem to acquire a
write lock:

>     bdrv_graph_rdunlock_main_loop();
>
>     /*
>      * Remove target parent that still uses BLK_PERM_WRITE/RESIZE before
>      * inserting target_bs at s->to_replace, where we might not be able to get
>      * these permissions.
>      */
>     blk_unref(s->target);
>     s->target = NULL;

The write lock is only taken in blk_remove_bs(), i.e. once the refcount drops
to 0 and the BB is actually removed:

>     bdrv_graph_wrlock();
>     bdrv_root_unref_child(root);
>     bdrv_graph_wrunlock();

Best Regards,
Fiona
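P.S.: To make the distinction explicit: the reference count itself is dropped
without any graph lock; the writer lock only comes into play once the refcount
reaches 0 and blk_remove_bs() actually detaches the node. A heavily simplified
sketch of the chain visible in the backtraces above (illustrative only, not
verbatim QEMU code; draining, assertions and error handling omitted):

    /* Illustrative sketch of the call chain; not verbatim QEMU code. */
    void blk_unref(BlockBackend *blk)
    {
        if (--blk->refcnt > 0) {
            return;                      /* no graph lock involved so far */
        }
        blk_delete(blk);                 /* last reference just went away */
    }

    static void blk_delete(BlockBackend *blk)
    {
        if (blk->root) {
            blk_remove_bs(blk);          /* only now does the graph change */
        }
        /* ... free the BlockBackend ... */
    }

    void blk_remove_bs(BlockBackend *blk)
    {
        BdrvChild *root = blk->root;
        /* ... */
        bdrv_graph_wrlock();             /* asserts !qemu_in_coroutine() */
        bdrv_root_unref_child(root);
        bdrv_graph_wrunlock();
    }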