On 02.07.2015 at 08:33, Fam Zheng wrote:
> bdrv_flush() uses a loop like
>
>     while (rwco.ret == NOT_DONE) {
>         aio_poll(aio_context, true);
>     }
>
> to wait for thread pool, which may not get notified about the scheduled
> BH right away, if there is no new event that wakes up a blocking
> qemu_poll_ns(). In this case, it may even be a permanent hang.
>
> Wake the main thread up by writing to the event notifier fd.
>
> Cc: Paolo Bonzini <pbonz...@redhat.com>
> Cc: Christian Borntraeger <borntrae...@de.ibm.com>
> Signed-off-by: Fam Zheng <f...@redhat.com>
>
> ---
>
> I suspect this may relate to
>
> [Qemu-devel] "iothread: release iothread around aio_poll" causes random
> hangs at startup
>
> [http://lists.nongnu.org/archive/html/qemu-devel/2015-06/msg00623.html]
>
> reported by Christian Borntraeger. Because in iothread there is rarely
> any fd activity, so the blocking aio_poll() may block forever if it
> misses the BH schedule.
>
> Christian, could you test this patch against your reproducer?

Still does not work. It really seems to be triggered by the null device
(and there must be >= 2).

> ---
>  thread-pool.c | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/thread-pool.c b/thread-pool.c
> index ac909f4..9b9c065 100644
> --- a/thread-pool.c
> +++ b/thread-pool.c
> @@ -112,6 +112,7 @@ static void *worker_thread(void *opaque)
>          qemu_mutex_lock(&pool->lock);
>
>          qemu_bh_schedule(pool->completion_bh);
> +        aio_notify(pool->ctx);
>      }
>
>      pool->cur_threads--;
>
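
For readers following the thread, here is a minimal standalone sketch of the
wake-up pattern the patch description relies on: a waiter blocks in poll()
with no timeout, and the worker makes a polled fd readable after publishing
its result. It assumes plain POSIX (pipe, poll, pthreads) rather than QEMU's
event notifier, and every demo_* name is hypothetical and not a QEMU API. It
illustrates the intended mechanism only; it does not reproduce the hang
reported above.

```c
#include <assert.h>
#include <poll.h>
#include <pthread.h>
#include <stdatomic.h>
#include <unistd.h>

#define NOT_DONE 0x7fffffff   /* sentinel meaning "no result yet" in this sketch */

/* Hypothetical stand-in for an event loop context; not a QEMU type. */
struct demo_ctx {
    int notify_fd[2];         /* self-pipe: [0] is polled, [1] is written */
    atomic_int ret;           /* NOT_DONE until the worker finishes */
};

/* Rough analogue of a notify call: make the polled fd readable. */
static void demo_notify(struct demo_ctx *ctx)
{
    char byte = 0;
    ssize_t n = write(ctx->notify_fd[1], &byte, 1);
    (void)n;
}

static void *demo_worker(void *opaque)
{
    struct demo_ctx *ctx = opaque;

    atomic_store(&ctx->ret, 0);   /* publish the result first */
    demo_notify(ctx);             /* then wake the waiter; without this the
                                     poll() below could block indefinitely */
    return NULL;
}

/* Returns the worker's result (0), or -1 if setup failed. */
int demo_wait_for_completion(void)
{
    struct demo_ctx ctx;
    pthread_t tid;
    int result;

    if (pipe(ctx.notify_fd) != 0) {
        return -1;
    }
    atomic_init(&ctx.ret, NOT_DONE);

    if (pthread_create(&tid, NULL, demo_worker, &ctx) != 0) {
        close(ctx.notify_fd[0]);
        close(ctx.notify_fd[1]);
        return -1;
    }

    /* Same shape as the quoted loop: block until the result is published. */
    while (atomic_load(&ctx.ret) == NOT_DONE) {
        struct pollfd pfd = { .fd = ctx.notify_fd[0], .events = POLLIN };

        /* Timeout -1 blocks until the fd is readable or a signal arrives. */
        if (poll(&pfd, 1, -1) > 0 && (pfd.revents & POLLIN)) {
            char byte;
            ssize_t n = read(ctx.notify_fd[0], &byte, 1);  /* drain the wake-up */
            (void)n;
        }
    }

    pthread_join(tid, NULL);
    result = atomic_load(&ctx.ret);
    assert(result != NOT_DONE);   /* the loop only exits once a result exists */

    close(ctx.notify_fd[0]);
    close(ctx.notify_fd[1]);
    return result;
}
```

Calling demo_wait_for_completion() from a small main() and building with
-pthread is enough to try it. The worker stores the result before it writes
to the pipe, so the waiter either sees the result before it polls or is woken
by the write afterwards.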