Previously we cleanup the queues when we got CLOSED event. It was used to make sure we won't leftover replies/events of a old client to a new client. Now this patch postpones that until OPENED.
In most cases, it does not really matter much since either way will make sure that the new client won't receive unexpected old events/responses. However there can be a very rare race condition with the old way when we use QMP with stdio and pipelines (which is quite common in iotests). The problem is that, we can easily have something like this in scripts: cat $QMP_COMMANDS | qemu -qmp stdio ... | filter_commands In most cases, a QMP session will be based on a bidirectional channel (read/write to a TCP port, for example), so IN and OUT are sharing some fundamentally same thing. However that's untrue for pipes above - the IN is the "cat" program, while the "OUT" is directed to the "filter_commands" which is actually another process. Now when we received the CLOSED event, we cleanup the queues (including the QMP response queue). That means, if we kill/stop the "cat" process faster than the filter_commands process, the filter_commands process is possible to miss some responses/events that should belong to it. In practise, I encountered a very strange SHUTDOWN event missing when running test with iotest 087. Without this patch, iotest 078 will have ~10% chance to miss the SHUTDOWN event and fail when with Out-Of-Band enabled: 087 8s ... - output mismatch (see 087.out.bad) --- /home/peterx/git/qemu/tests/qemu-iotests/087.out 2018-06-01 18:44:22.378982462 +0800 +++ /home/peterx/git/qemu/bin/tests/qemu-iotests/087.out.bad 2018-06-01 18:53:44.267840928 +0800 @@ -8,7 +8,6 @@ {"return": {}} {"error": {"class": "GenericError", "desc": "'node-name' must be specified for the root node"}} {"return": {}} -{"timestamp": {"seconds": TIMESTAMP, "microseconds": TIMESTAMP}, "event": "SHUTDOWN", "data": {"guest": false}} === Duplicate ID === @@ -53,7 +52,6 @@ {"return": {}} {"return": {}} {"return": {}} -{"timestamp": {"seconds": TIMESTAMP, "microseconds": TIMESTAMP}, "event": "SHUTDOWN", "data": {"guest": false}} This patch fixes the problem. Fixes: 6d2d563f8c ("qmp: cleanup qmp queues properly", 2018-03-27) Signed-off-by: Peter Xu <pet...@redhat.com> Signed-off-by: Peter Xu <pet...@redhat.com> --- monitor.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/monitor.c b/monitor.c index 46814af533..6f26108108 100644 --- a/monitor.c +++ b/monitor.c @@ -4354,6 +4354,7 @@ static void monitor_qmp_event(void *opaque, int event) switch (event) { case CHR_EVENT_OPENED: + monitor_qmp_cleanup_queues(mon); mon->qmp.commands = &qmp_cap_negotiation_commands; monitor_qmp_caps_reset(mon); data = get_qmp_greeting(mon); @@ -4362,7 +4363,6 @@ static void monitor_qmp_event(void *opaque, int event) mon_refcount++; break; case CHR_EVENT_CLOSED: - monitor_qmp_cleanup_queues(mon); json_message_parser_destroy(&mon->qmp.parser); json_message_parser_init(&mon->qmp.parser, handle_qmp_command); mon_refcount--; -- 2.17.0