On 08/07/14 19:08, Paolo Bonzini wrote: > Il 08/07/2014 17:59, Stefan Hajnoczi ha scritto: >> I sent Christian an initial patch to fix this but now both threads are >> stuck in rfifolock_lock() inside cond wait. That's very strange and >> should never happen. > > I had this patch pending for 2.2: > > commit 6c81e31615c3cda5ea981a998ba8b1b8ed17de6f > Author: Paolo Bonzini <pbonz...@redhat.com> > Date: Mon Jul 7 10:39:49 2014 +0200 > > iothread: do not rely on aio_poll(ctx, true) result to end a loop > > Currently, whenever aio_poll(ctx, true) has completed all pending > work it returns true *and* the next call to aio_poll(ctx, true) > will not block. > > This invariant has its roots in qemu_aio_flush()'s implementation > as "while (qemu_aio_wait()) {}". However, qemu_aio_flush() does > not exist anymore and bdrv_drain_all() is implemented differently; > and this invariant is complicated to maintain and subtly different > from the return value of GMainLoop's g_main_context_iteration. > > All calls to aio_poll(ctx, true) except one are guarded by a > while() loop checking for a request to be incomplete, or a > BlockDriverState to be idle. Modify that one exception in > iothread.c. > > Signed-off-by: Paolo Bonzini <pbonz...@redhat.com>
The hangs are gone. Looks like 2.1 material now... Acked-by: Christian Borntraeger <borntrae...@de.ibm.com> Tested-by: Christian Borntraeger <borntrae...@de.ibm.com> > > diff --git a/iothread.c b/iothread.c > index 1fbf9f1..d9403cf 100644 > --- a/iothread.c > +++ b/iothread.c > @@ -30,6 +30,7 @@ typedef ObjectClass IOThreadClass; > static void *iothread_run(void *opaque) > { > IOThread *iothread = opaque; > + bool blocking; > > qemu_mutex_lock(&iothread->init_done_lock); > iothread->thread_id = qemu_get_thread_id(); > @@ -38,8 +39,10 @@ static void *iothread_run(void *opaque) > > while (!iothread->stopping) { > aio_context_acquire(iothread->ctx); > - while (!iothread->stopping && aio_poll(iothread->ctx, true)) { > + blocking = true; > + while (!iothread->stopping && aio_poll(iothread->ctx, blocking)) { > /* Progress was made, keep going */ > + blocking = false; > } > aio_context_release(iothread->ctx); > } > > Christian, can you test it? > > Paolo >