Luiz Capitulino <lcapitul...@redhat.com> writes: > On Sun, 26 May 2013 10:33:39 -0500 > Michael Roth <mdr...@linux.vnet.ibm.com> wrote: > >> In the past, CHR_EVENT_OPENED events were emitted via a pre-expired >> QEMUTimer. Due to timers being processing at the tail end of each main >> loop iteration, this generally meant that such events would be emitted >> within the same main loop iteration, prior any client data being read >> by tcp/unix socket server backends. >> >> With 9f939df955a4152aad69a19a77e0898631bb2c18, this was switched to >> a "bottom-half" that is registered via g_idle_add(). This makes it >> likely that the event won't be sent until a subsequent iteration, and >> also adds the possibility that such events will race with the >> processing of client data. >> >> In cases where we rely on the CHR_EVENT_OPENED event being delivered >> prior to any client data being read, this may cause unexpected behavior. >> >> In the case of a QMP monitor session using a socket backend, the delayed >> processing of the CHR_EVENT_OPENED event can lead to a situation where >> a previous session, where 'qmp_capabilities' has already processed, can >> cause the rejection of 'qmp_capabilities' for a subsequent session, >> since we reset capabilities in response to CHR_EVENT_OPENED, which may >> not yet have been delivered. This can be reproduced with the following >> command, generally within 50 or so iterations: >> >> mdroth@loki:~$ cat cmds >> {'execute':'qmp_capabilities'} >> {'execute':'query-cpus'} >> mdroth@loki:~$ while true; do if socat stdio unix-connect:/tmp/qmp.sock >> <cmds | grep CommandNotFound &>/dev/null; then echo failed; break; else >> echo ok; fi; done; >> ok >> ok >> failed >> mdroth@loki:~$ >> >> As a work-around, we reset capabilities as part of the CHR_EVENT_CLOSED >> hook, which gets called as part of the GIOChannel cb associated with the >> client rather than a bottom-half, and is thus guaranteed to be delivered >> prior to accepting any subsequent client sessions. >> >> This does not address the more general problem of whether or not there >> are chardev users that rely on CHR_EVENT_OPENED being delivered prior to >> client data, and whether or not we should consider moving CHR_EVENT_OPENED >> "in-band" with connection establishment as a general solution, but fixes >> QMP for the time being. >> >> Reported-by: Stefan Priebe <s.pri...@profihost.ag> >> Cc: qemu-sta...@nongnu.org >> Signed-off-by: Michael Roth <mdr...@linux.vnet.ibm.com> > > Thanks for debugging this Michael. I'm going to apply your patch to the qmp > branch because we can't let this broken (esp. in -stable) but this is papering > over the real bug...
Do we really need OPENED to happen in a bottom half? Shouldn't the open event handlers register the callback instead if they need it? Regards, Anthony Liguori > >> --- >> v1->v2: >> * remove command_mode reset from CHR_EVENT_OPENED case, since this >> might still cause a race >> >> monitor.c | 2 +- >> 1 file changed, 1 insertion(+), 1 deletion(-) >> >> diff --git a/monitor.c b/monitor.c >> index 6ce2a4e..f1953a0 100644 >> --- a/monitor.c >> +++ b/monitor.c >> @@ -4649,7 +4649,6 @@ static void monitor_control_event(void *opaque, int >> event) >> >> switch (event) { >> case CHR_EVENT_OPENED: >> - mon->mc->command_mode = 0; >> data = get_qmp_greeting(); >> monitor_json_emitter(mon, data); >> qobject_decref(data); >> @@ -4660,6 +4659,7 @@ static void monitor_control_event(void *opaque, int >> event) >> json_message_parser_init(&mon->mc->parser, handle_qmp_command); >> mon_refcount--; >> monitor_fdsets_cleanup(); >> + mon->mc->command_mode = 0; >> break; >> } >> }