Il 02/04/2013 11:04, Peter Crosthwaite ha scritto: > Public bug: 1154328 > Broken Commit: a29753f8aa79a34a324afebe340182a51a5aef11 > > ATM, the timeout from g_pollfds_fill is inhibiting unlocking of the > iothread. This is capable of causing a total deadlock when hw/serial > is used as a device. The bug manifests when you go -nographic -serial > mon:stdio and then paste 40 or more chars into the terminal. > > My knowledge of this g_foo is vague at best, but my best working > theory is this: > > - First 8 chars are recieved by the serial device no complaints. > - The next 32 chars, serial returns false for can_receive() so they > are buffered by the MuxDriver object - mux_chr_read() > - Buffer is full, so 41st char causes false return from Muxes own > can_read() > - This propagates all the way up to glib_pollfds_fill and manifests > as a timeout
I suppose you mean "manifests as timeout==0". The question is *which* GSource has a timeout of zero? Not the mux's: if mux_chr_can_read() returns zero, the prepare function returns FALSE without touching the timeout at all... static gboolean io_watch_poll_prepare(GSource *source, gint *timeout_) { IOWatchPoll *iwp = io_watch_poll_from_source(source); iwp->max_size = iwp->fd_can_read(iwp->opaque); if (iwp->max_size == 0) { return FALSE; } return g_io_watch_funcs.prepare(source, timeout_); } > - Timeout means no unlock of IOthread. Device land never sees any more > cycles so the serial port never progresses - no flushing of > buffer Still, this is plausible, so the patch looks correct. Paolo > - Deadlock > > Tested on petalogix_ml605 microblazeel machine model, which was faulty > due to 1154328. > > Fix by removing the conditions on unlocking the iothread. Don't know > what else this will break but the timeout is certainly the wrong > condition for the unlock. Probably the real solution is to have a more > selective unlock policy. > > I'm happy for someone to take this patch off my hands, or educate me on > the correct implementation. For the peeps doing automated testing on > nographic platforms this will get your build working again. > > Signed-off-by: Peter Crosthwaite <peter.crosthwa...@xilinx.com> > --- > main-loop.c | 8 ++------ > 1 files changed, 2 insertions(+), 6 deletions(-) > > diff --git a/main-loop.c b/main-loop.c > index eb80ff3..a376898 100644 > --- a/main-loop.c > +++ b/main-loop.c > @@ -194,15 +194,11 @@ static int os_host_main_loop_wait(uint32_t timeout) > > glib_pollfds_fill(&timeout); > > - if (timeout > 0) { > - qemu_mutex_unlock_iothread(); > - } > + qemu_mutex_unlock_iothread(); > > ret = g_poll((GPollFD *)gpollfds->data, gpollfds->len, timeout); > > - if (timeout > 0) { > - qemu_mutex_lock_iothread(); > - } > + qemu_mutex_lock_iothread(); > > glib_pollfds_poll(); > return ret; >