Kevin Wolf <kw...@redhat.com> writes:

> Am 23.05.2023 um 23:39 hat Fabiano Rosas geschrieben:
>> We're about to move calls to 'fstat' into the thread-pool to avoid
>> blocking VCPU threads should the system call take too long.
>>
>> To achieve that we first need to make sure none of its callers is
>> holding the aio_context lock, otherwise yielding before scheduling the
>> aiocb handler would result in a deadlock when the qemu_global_mutex is
>> released and another thread tries to acquire the aio_context.
>>
>> Signed-off-by: Fabiano Rosas <faro...@suse.de>
>> ---
>>  block/qapi.c | 22 +++++++++++++++++++++-
>>  1 file changed, 21 insertions(+), 1 deletion(-)
>>
>> diff --git a/block/qapi.c b/block/qapi.c
>> index ae6cd1c2ff..cd197abf1f 100644
>> --- a/block/qapi.c
>> +++ b/block/qapi.c
>> @@ -222,6 +222,26 @@ int bdrv_query_snapshot_info_list(BlockDriverState *bs,
>>      return 0;
>>  }
>>
>> +static int64_t bdrv_get_actual_size(BlockDriverState *bs)
>> +{
>> +    int64_t size;
>> +    AioContext *old_ctx = NULL;
>> +
>> +    if (qemu_in_coroutine()) {
>
> Hm. Why can't we make sure that it always runs in a coroutine?
>
> Callers:
>
> * bdrv_query_block_node_info(). This function seems to be completely
>   unused, we can remove it.
>
Ok, I'm already removing it in patch 1.

> * bdrv_query_image_info(). Called by bdrv_block_device_info() and itself.
>   bdrv_block_device_info() could become a co_wrapper after swapping the
>   first two parameters so that it runs in the AioContext of @bs.
>

We cannot have bdrv_block_device_info() as a co_wrapper because the
wrapper would create its own coroutine, and yielding from that coroutine
would merely leave us waiting in bdrv_poll_co(). So it doesn't solve the
blocking issue.

What would work is to make bdrv_block_device_info() a coroutine_fn
(without a wrapper). Its two callers, qmp_query_block() and
qmp_query_named_block_nodes(), are already being moved into the handler
coroutine in this series, so it would mostly be a matter of adding the
type annotation.

> * bdrv_query_block_graph_info(). Only called by qemu-img. Could become a
>   co_wrapper_bdrv_rdlock.
>

This one works ok.

>> +        aio_context_release(bdrv_get_aio_context(bs));
>> +        old_ctx = bdrv_co_enter(bs);
>
> I think this is the wrong function to do this. The caller should already
> make sure that it's in the right AioContext.
>

The issue here is that bdrv_do_query_node_info() calls
bdrv_co_get_allocated_file_size(), which is the function we want to make
async and which therefore needs to run outside of
aio_context_acquire/release. However, bdrv_do_query_node_info() also
calls bdrv_refresh_filename(), which is GLOBAL_STATE_CODE and therefore
wants to run in the main thread. So entering the bs AioContext at the
caller doesn't work when giving the device its own iothread.

>> +    }
>> +
>> +    size = bdrv_get_allocated_file_size(bs);
>> +
>> +    if (qemu_in_coroutine() && old_ctx) {
>> +        bdrv_co_leave(bs, old_ctx);
>> +        aio_context_acquire(bdrv_get_aio_context(bs));
>> +    }
>> +
>> +    return size;
>> +}
>
> Kevin