On 4/30/25 11:47 AM, Fiona Ebner wrote:
> Am 24.04.25 um 19:32 schrieb Andrey Drobyshev:
>> So it looks like the main thread is processing a job-dismiss request
>> and is holding the write lock taken in block_job_remove_all_bdrv()
>> (frame #20 above).  At the same time the iothread spawns a coroutine
>> which performs an IO request.  Before the coroutine is spawned,
>> blk_aio_prwv() increases the 'in_flight' counter for the BlockBackend.
>> Then blk_co_do_preadv_part() (frame #5) tries to acquire the read
>> lock.  But the main thread isn't releasing the write lock, as
>> blk_root_drained_poll() returns true since blk->in_flight > 0.
>> Hence the deadlock.
> 
> And for the IO test you provided, it's client->nb_requests that behaves
> similarly to blk->in_flight here.
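
Right.  Just to make the interleaving easier to see outside of QEMU,
here is a minimal standalone pthreads sketch of the same pattern (purely
illustrative, not actual QEMU code; graph_lock and in_flight merely
mimic the real graph lock and BlockBackend counter).  The "main thread"
holds the write lock and polls the in-flight counter, while the
"iothread" bumps the counter before trying to take the read lock, so
neither side can make progress:

#include <pthread.h>
#include <stdatomic.h>
#include <stdio.h>
#include <unistd.h>

static pthread_rwlock_t graph_lock = PTHREAD_RWLOCK_INITIALIZER;
static atomic_int in_flight;

/* Analogue of the iothread issuing an IO request. */
static void *iothread_fn(void *arg)
{
    (void)arg;
    atomic_fetch_add(&in_flight, 1);     /* like blk_aio_prwv() bumping blk->in_flight */
    pthread_rwlock_rdlock(&graph_lock);  /* like blk_co_do_preadv_part(); blocks on the writer */
    atomic_fetch_sub(&in_flight, 1);     /* never reached */
    pthread_rwlock_unlock(&graph_lock);
    return NULL;
}

int main(void)
{
    pthread_t iothread;

    pthread_rwlock_wrlock(&graph_lock);  /* like bdrv_graph_wrlock() in the main thread */
    pthread_create(&iothread, NULL, iothread_fn, NULL);
    sleep(1);                            /* give the reader time to bump the counter and block */

    /* like blk_root_drained_poll(): wait for in-flight requests to finish... */
    while (atomic_load(&in_flight) > 0) {
        /* ...which never happens, because the reader is stuck on the lock */
    }

    pthread_rwlock_unlock(&graph_lock);
    pthread_join(iothread, NULL);
    printf("not reached\n");
    return 0;
}
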
> 
> The issue also reproduces easily when issuing the following QMP command
> in a loop while doing IO on a device:
> 
>> void qmp_block_locked_drain(const char *node_name, Error **errp)
>> {
>>     BlockDriverState *bs;
>>
>>     bs = bdrv_find_node(node_name);
>>     if (!bs) {
>>         error_setg(errp, "node not found");
>>         return;
>>     }
>>
>>     bdrv_graph_wrlock();
>>     bdrv_drained_begin(bs);
>>     bdrv_drained_end(bs);
>>     bdrv_graph_wrunlock();
>> }
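
As a side note, assuming that debugging command were wired into the QAPI
schema as, say, "block-locked-drain" and the VM had a node named "disk0"
(both names made up here), the reproduction loop could look roughly like
the qtest sketch below, with the actual IO load generated separately,
e.g. by the guest:

#include "libqtest.h"

int main(void)
{
    QTestState *qts;
    int i;

    /* null-co backend just to have a named node; adjust as needed */
    qts = qtest_init("-blockdev driver=null-co,node-name=disk0,size=1073741824 "
                     "-device virtio-blk-pci,drive=disk0");

    /* hammer the hypothetical command while IO is running elsewhere */
    for (i = 0; i < 100000; i++) {
        qtest_qmp_assert_success(qts,
            "{'execute': 'block-locked-drain',"
            " 'arguments': {'node-name': 'disk0'}}");
    }

    qtest_quit(qts);
    return 0;
}
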
> 
> It seems like it would be necessary to either:
> 1. require not draining inside an exclusively locked section,
> or
> 2. make sure that variables used by drained_poll routines are only set
> while holding the reader lock?
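
In terms of the standalone sketch earlier in this mail, option 2 would
amount to only touching the counter while the reader lock is already
held, e.g. a variant of iothread_fn() like the one below.  A writer
polling the counter then can never observe a value raised by a reader
that is still waiting for the lock, so the poll completes and the
writer eventually releases the lock:

static void *iothread_fn(void *arg)
{
    (void)arg;
    pthread_rwlock_rdlock(&graph_lock);  /* take the reader lock first */
    atomic_fetch_add(&in_flight, 1);     /* only now announce the request */

    /* ... perform the request ... */

    atomic_fetch_sub(&in_flight, 1);
    pthread_rwlock_unlock(&graph_lock);
    return NULL;
}
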
> 
> Those seem to require rather involved changes, so a third option might
> be to make draining inside an exclusively locked section possible, by
> embedding such locked sections in a drained section:
> 
>> diff --git a/blockjob.c b/blockjob.c
>> index 32007f31a9..9b2f3b3ea9 100644
>> --- a/blockjob.c
>> +++ b/blockjob.c
>> @@ -198,6 +198,7 @@ void block_job_remove_all_bdrv(BlockJob *job)
>>       * one to make sure that such a concurrent access does not attempt
>>       * to process an already freed BdrvChild.
>>       */
>> +    bdrv_drain_all_begin();
>>      bdrv_graph_wrlock();
>>      while (job->nodes) {
>>          GSList *l = job->nodes;
>> @@ -211,6 +212,7 @@ void block_job_remove_all_bdrv(BlockJob *job)
>>          g_slist_free_1(l);
>>      }
>>      bdrv_graph_wrunlock();
>> +    bdrv_drain_all_end();
>>  }
>>  
>>  bool block_job_has_bdrv(BlockJob *job, BlockDriverState *bs)
> 
> This seems to fix the issue at hand. I can send a patch if this is
> considered an acceptable approach.
> 
> Best Regards,
> Fiona
> 

Hello Fiona,

Thanks for looking into it.  I've tried your 3rd option above and can
confirm it does fix the deadlock; at least I can't reproduce it anymore.
Other iotests don't seem to break either.  So I personally am fine with
that patch.  It would be nice to hear a word from the maintainers,
though, on whether there are any caveats with such an approach.

Andrey
