Kevin Wolf <kw...@redhat.com> writes:

> Am 23.05.2023 um 23:39 hat Fabiano Rosas geschrieben:
>> We're about to move calls to 'fstat' into the thread-pool to avoid
>> blocking VCPU threads should the system call take too long.
>>
>> To achieve that we first need to make sure none of its callers is
>> holding the aio_context lock, otherwise yielding before scheduling the
>> aiocb handler would result in a deadlock when the qemu_global_mutex is
>> released and another thread tries to acquire the aio_context.
>>
>> Signed-off-by: Fabiano Rosas <faro...@suse.de>
>> ---
>>  block/qapi.c | 22 +++++++++++++++++++++-
>>  1 file changed, 21 insertions(+), 1 deletion(-)
>>
>> diff --git a/block/qapi.c b/block/qapi.c
>> index ae6cd1c2ff..cd197abf1f 100644
>> --- a/block/qapi.c
>> +++ b/block/qapi.c
>> @@ -222,6 +222,26 @@ int bdrv_query_snapshot_info_list(BlockDriverState *bs,
>>      return 0;
>>  }
>>
>> +static int64_t bdrv_get_actual_size(BlockDriverState *bs)
>> +{
>> +    int64_t size;
>> +    AioContext *old_ctx = NULL;
>> +
>> +    if (qemu_in_coroutine()) {
>
> Hm. Why can't we make sure that it always runs in a coroutine?
>
> Callers:
>
> * bdrv_query_block_node_info(). This function seems to be completely
>   unused, we can remove it.
>
Ok, I'm already removing it in patch 1.

> * bdrv_query_image_info(). Called by bdrv_block_device_info() and itself.
>   bdrv_block_device_info() could become a co_wrapper after swapping the
>   first two parameters so that it runs in the AioContext of @bs.
>

We cannot have bdrv_block_device_info() as a co_wrapper because the
wrapper would create its own coroutine, and yielding from that coroutine
would merely leave us waiting in bdrv_poll_co(). So it doesn't solve the
blocking issue.

What would work is to make bdrv_block_device_info() a coroutine_fn
(without a wrapper). Its two callers, qmp_query_block() and
qmp_query_named_block_nodes(), are already being moved into the handler
coroutine in this series, so it would mostly be a matter of adding the
type annotation.

> * bdrv_query_block_graph_info(). Only called by qemu-img. Could become a
>   co_wrapper_bdrv_rdlock.
>

This one works ok.

>> +        aio_context_release(bdrv_get_aio_context(bs));
>> +        old_ctx = bdrv_co_enter(bs);
>
> I think this is the wrong function to do this. The caller should already
> make sure that it's in the right AioContext.
>

The issue here is that bdrv_do_query_node_info() calls
bdrv_co_get_allocated_file_size(), which is the function we want to make
async and which therefore needs to run outside of
aio_context_acquire/release. However, bdrv_do_query_node_info() also
calls bdrv_refresh_filename(), which is GLOBAL_STATE_CODE and therefore
wants to run in the main thread. So entering the bs AioContext at the
caller doesn't work when giving the device its own iothread.

>> +    }
>> +
>> +    size = bdrv_get_allocated_file_size(bs);
>> +
>> +    if (qemu_in_coroutine() && old_ctx) {
>> +        bdrv_co_leave(bs, old_ctx);
>> +        aio_context_acquire(bdrv_get_aio_context(bs));
>> +    }
>> +
>> +    return size;
>> +}
>
> Kevin