Hi Stefan,

On 2020/8/12 21:51, Stefan Hajnoczi wrote:
> On Mon, Aug 10, 2020 at 10:52:44PM +0800, Zhenyu Ye wrote:
>> Before doing qmp actions, we need to lock the qemu_global_mutex,
>> so the qmp actions should not take too long time.
>>
>> Unfortunately, some qmp actions need to acquire aio context and
>> this may take a long time. The vm will soft lockup if this time
>> is too long.
>>
>> So add a timeout mechanism while doing qmp actions.
>
> aio_context_acquire_timeout() is a workaround for code that blocks the
> event loop. Ideally there should be no code that blocks the event loop.
>
> Which cases have you found where the event loop is blocked?
>

So far I have only found that io_submit() blocks when I/O pressure is
too high. For details, see:

  https://lore.kernel.org/qemu-devel/c6d75e49-3e36-6a76-fdc8-cdf09e7c3...@huawei.com/

io_submit() is not guaranteed to be non-blocking at all times.

> I think they should be investigated and fixed (if possible) before
> introducing an API like aio_context_acquire_timeout().
>

We cannot ensure that everything in the iothread is non-blocking, because
some operations look asynchronous but can still block at times (such as
io_submit()). In any case, the _timeout() API makes the qmp commands that
need to acquire the aio_context safer.

Thanks,
Zhenyu
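
To illustrate the general idea of a bounded lock acquisition, here is a
minimal sketch using plain pthreads. It is not the code from the patch and
does not use any QEMU API; lock_with_timeout() is a hypothetical helper
name chosen only for this example, and the caller is assumed to give up
(for example, fail the qmp command) when it returns ETIMEDOUT.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <pthread.h>
#include <time.h>

/*
 * Hypothetical helper (not a QEMU function): try to take `lock`, waiting
 * at most `timeout_ms` milliseconds.
 * Returns 0 on success, ETIMEDOUT if the lock was still held when the
 * deadline passed, or another errno value reported by the pthread call.
 */
static int lock_with_timeout(pthread_mutex_t *lock, long timeout_ms)
{
    struct timespec deadline;

    /* pthread_mutex_timedlock() expects an absolute CLOCK_REALTIME time. */
    if (clock_gettime(CLOCK_REALTIME, &deadline) != 0) {
        return errno;
    }

    deadline.tv_sec += timeout_ms / 1000;
    deadline.tv_nsec += (timeout_ms % 1000) * 1000000L;
    if (deadline.tv_nsec >= 1000000000L) {
        /* Carry the nanosecond overflow into the seconds field. */
        deadline.tv_sec += 1;
        deadline.tv_nsec -= 1000000000L;
    }

    return pthread_mutex_timedlock(lock, &deadline);
}
```

A caller would check the return value and back off instead of waiting
indefinitely while holding the global mutex.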