On Wed, Apr 30, 2025 at 10:11 AM Andrey Drobyshev
<andrey.drobys...@virtuozzo.com> wrote:
>
> On 4/30/25 11:47 AM, Fiona Ebner wrote:
> > Am 24.04.25 um 19:32 schrieb Andrey Drobyshev:
> >> So it looks like the main thread is processing a job-dismiss request
> >> and is holding the write lock taken in block_job_remove_all_bdrv()
> >> (frame #20 above). At the same time, the iothread spawns a coroutine
> >> which performs an IO request. Before the coroutine is spawned,
> >> blk_aio_prwv() increases the BlockBackend's 'in_flight' counter. Then
> >> blk_co_do_preadv_part() (frame #5) tries to acquire the read lock. But
> >> the main thread never releases the lock, because blk_root_drained_poll()
> >> returns true as long as blk->in_flight > 0. Here's the deadlock.
> >
> > And for the IO test you provided, it's client->nb_requests that behaves
> > similarly to blk->in_flight here.
> >
> > The issue also reproduces easily when issuing the following QMP command
> > in a loop while doing IO on a device:
> >
> >> void qmp_block_locked_drain(const char *node_name, Error **errp)
> >> {
> >>     BlockDriverState *bs;
> >>
> >>     bs = bdrv_find_node(node_name);
> >>     if (!bs) {
> >>         error_setg(errp, "node not found");
> >>         return;
> >>     }
> >>
> >>     bdrv_graph_wrlock();
> >>     bdrv_drained_begin(bs);
> >>     bdrv_drained_end(bs);
> >>     bdrv_graph_wrunlock();
> >> }
> >
> > It seems it would be necessary to require either:
> > 1. not draining inside an exclusively locked section
> > or
> > 2. making sure that variables used by drained_poll routines are only set
> >    while holding the reader lock
> > ?
> >
> > Those seem to require rather involved changes, so a third option might
> > be to make draining inside an exclusively locked section possible, by
> > embedding such locked sections in a drained section:
> >
> >> diff --git a/blockjob.c b/blockjob.c
> >> index 32007f31a9..9b2f3b3ea9 100644
> >> --- a/blockjob.c
> >> +++ b/blockjob.c
> >> @@ -198,6 +198,7 @@ void block_job_remove_all_bdrv(BlockJob *job)
> >>       * one to make sure that such a concurrent access does not attempt
> >>       * to process an already freed BdrvChild.
> >>       */
> >> +    bdrv_drain_all_begin();
> >>      bdrv_graph_wrlock();
> >>      while (job->nodes) {
> >>          GSList *l = job->nodes;
> >> @@ -211,6 +212,7 @@ void block_job_remove_all_bdrv(BlockJob *job)
> >>          g_slist_free_1(l);
> >>      }
> >>      bdrv_graph_wrunlock();
> >> +    bdrv_drain_all_end();
> >>  }
> >>
> >>  bool block_job_has_bdrv(BlockJob *job, BlockDriverState *bs)
> >
> > This seems to fix the issue at hand. I can send a patch if this is
> > considered an acceptable approach.
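For anyone skimming the thread, the failure mode described above boils down
to a lock-ordering problem between the drained_poll loop and the graph lock.
A minimal stand-alone sketch of it in plain pthreads (a simplified model
with made-up names, not QEMU code) hangs in the same way:

#include <pthread.h>
#include <stdatomic.h>
#include <unistd.h>

static pthread_rwlock_t graph_lock = PTHREAD_RWLOCK_INITIALIZER;
static atomic_int in_flight;

/* Plays the iothread: account for the request, then wait for the lock. */
static void *io_thread(void *arg)
{
    atomic_fetch_add(&in_flight, 1);     /* like blk_aio_prwv() */
    pthread_rwlock_rdlock(&graph_lock);  /* blocks: writer never lets go */
    /* the actual I/O would happen here */
    atomic_fetch_sub(&in_flight, 1);
    pthread_rwlock_unlock(&graph_lock);
    return NULL;
}

int main(void)
{
    pthread_t t;

    pthread_rwlock_wrlock(&graph_lock);  /* like bdrv_graph_wrlock() */
    pthread_create(&t, NULL, io_thread, NULL);
    sleep(1);                            /* let io_thread bump in_flight */

    /* Plays blk_root_drained_poll(): spins forever, in_flight never drops. */
    while (atomic_load(&in_flight) > 0) {
        /* poll */
    }

    pthread_rwlock_unlock(&graph_lock);  /* never reached */
    pthread_join(t, NULL);
    return 0;
}

The writer can only release the lock once in_flight drops back to zero, and
the reader can only drop in_flight once it gets the lock, so neither side
makes progress. That's also why the proposed change helps: with
bdrv_drain_all_begin() placed before bdrv_graph_wrlock(), in-flight requests
get a chance to complete before the exclusive section starts.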
Kevin is aware of this thread, but it's a public holiday tomorrow, so it may
be a little longer.

Stefan