bdrv_flush() uses a loop like

    while (rwco.ret == NOT_DONE) {
        aio_poll(aio_context, true);
    }

to wait for the thread pool. The waiter may not be notified about the
scheduled BH right away if no new event wakes up the blocking
qemu_poll_ns(). In that case, it may even hang permanently.

Wake the main thread up by writing to the event notifier fd.

Cc: Paolo Bonzini <pbonz...@redhat.com>
Cc: Christian Borntraeger <borntrae...@de.ibm.com>
Signed-off-by: Fam Zheng <f...@redhat.com>
---

I suspect this may relate to:

[Qemu-devel] "iothread: release iothread around aio_poll" causes random
hangs at startup
http://lists.nongnu.org/archive/html/qemu-devel/2015-06/msg00623.html

reported by Christian Borntraeger. Because there is rarely any fd
activity in an iothread, the blocking aio_poll() may block forever if it
misses the BH schedule.

Christian, could you test this patch against your reproducer?
---
 thread-pool.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/thread-pool.c b/thread-pool.c
index ac909f4..9b9c065 100644
--- a/thread-pool.c
+++ b/thread-pool.c
@@ -112,6 +112,7 @@ static void *worker_thread(void *opaque)
         qemu_mutex_lock(&pool->lock);
 
         qemu_bh_schedule(pool->completion_bh);
+        aio_notify(pool->ctx);
     }
 
     pool->cur_threads--;
-- 
2.4.3
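
For readers less familiar with the mechanism this patch relies on, the wake-up can be sketched outside QEMU as follows. This is only an illustration, not QEMU code: Notifier, notifier_init(), notifier_set() and wait_for_wakeup() are hypothetical names, and a non-blocking pipe stands in for the event notifier fd. The point is that a poller blocked on an fd set only returns when one of those fds becomes readable, so whoever schedules work has to write to the notifier fd as well.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <poll.h>
#include <stdbool.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical stand-in for an event notifier: a non-blocking pipe. */
typedef struct {
    int rfd; /* polled by the waiting thread */
    int wfd; /* written by whoever schedules work */
} Notifier;

static int set_nonblock(int fd)
{
    int flags = fcntl(fd, F_GETFL);
    if (flags < 0) {
        return -1;
    }
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK);
}

int notifier_init(Notifier *n)
{
    int fds[2];

    assert(n != NULL);
    if (pipe(fds) < 0) {
        return -1;
    }
    if (set_nonblock(fds[0]) < 0 || set_nonblock(fds[1]) < 0) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    n->rfd = fds[0];
    n->wfd = fds[1];
    return 0;
}

/* Plays the role that aio_notify() plays in the patch: make rfd readable. */
void notifier_set(Notifier *n)
{
    char byte = 1;
    ssize_t r;

    do {
        r = write(n->wfd, &byte, 1);
    } while (r < 0 && errno == EINTR);
    /* EAGAIN means the pipe is full, so a wake-up is already pending. */
}

/* Drain the pipe so the next poll() blocks again. */
static void notifier_clear(Notifier *n)
{
    char buf[64];
    ssize_t r;

    do {
        r = read(n->rfd, buf, sizeof(buf));
    } while (r > 0 || (r < 0 && errno == EINTR));
}

/* Returns true if woken by the notifier, false on timeout or error. */
bool wait_for_wakeup(Notifier *n, int timeout_ms)
{
    struct pollfd pfd = { .fd = n->rfd, .events = POLLIN };
    int r;

    do {
        r = poll(&pfd, 1, timeout_ms);
    } while (r < 0 && errno == EINTR);

    if (r > 0 && (pfd.revents & POLLIN)) {
        notifier_clear(n);
        return true;
    }
    return false;
}
```

With a timeout of -1 and no call to notifier_set(), wait_for_wakeup() in this sketch would never return, which is the analogue of the hang described above. A byte written before the poller blocks is not lost either, since it stays in the pipe until it is drained.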