On 11/16/2023 10:14 AM, Markus Armbruster wrote: > Steven Sistare <steven.sist...@oracle.com> writes: >> On 11/15/2023 11:15 AM, Markus Armbruster wrote: >>> Steven Sistare <steven.sist...@oracle.com> writes: >>>> On 11/15/2023 3:41 AM, Markus Armbruster wrote: >>>>> Daniel P. Berrangé <berra...@redhat.com> writes: >>>>> >>>>>> On Fri, Nov 03, 2023 at 03:51:00PM -0400, Steven Sistare wrote: >>>>>>> On 11/3/2023 1:33 PM, Daniel P. Berrangé wrote: >>>>>>>> On Fri, Nov 03, 2023 at 09:01:29AM -0700, Steve Sistare wrote: >>>>>>>>> Buffered monitor output is lost when abort() is called. The pattern >>>>>>>>> error_report() followed by abort() occurs about 60 times, so valuable >>>>>>>>> information is being lost when the abort is called in the context of a >>>>>>>>> monitor command. >>>>>>>> >>>>>>>> I'm curious, was there a particular abort() scenario that you hit ? >>>>>>> >>>>>>> Yes, while tweaking the suspended state, and forgetting to add >>>>>>> transitions: >>>>>>> >>>>>>> error_report("invalid runstate transition: '%s' -> '%s'", >>>>>>> abort(); >>>>>>> >>>>>>> But I have previously hit this for other errors. >>>>> >>>>> Can you provide a reproducer? >>>> >>>> I sometimes hit this when developing new code. I do not have a reproducer >>>> for upstream >>>> branches. The patch is aimed at helping developers, not users. >>> >>> I'm asking because I can't see how the error message could be lost. A >>> reproducer would let me find out. "Apply this set of broken patches, >>> then do that" would serve. >> >> $ patch -p1 << EOF >> diff --git a/monitor/qmp-cmds.c b/monitor/qmp-cmds.c >> index b0f948d..c9a3aee 100644 >> --- a/monitor/qmp-cmds.c >> +++ b/monitor/qmp-cmds.c >> @@ -47,8 +47,12 @@ void qmp_quit(Error **errp) >> qemu_system_shutdown_request(SHUTDOWN_CAUSE_HOST_QMP_QUIT); >> } >> >> +#include "qemu/error-report.h" >> + >> void qmp_stop(Error **errp) >> { >> + error_report("injected failure"); >> + abort(); >> /* if there is a dump in background, we should wait until the dump >> * finished */ >> if (qemu_system_dump_in_progress()) { >> EOF >> >> # This example loses the error message: >> >> $ args='-display none -chardev >> socket,id=mon1,server=on,path=mon1.sock,wait=off -mon mon1,mode=control' >> $ qemu-system-x86_64 $args < /dev/null & >> [1] 18048 >> $ echo '{"execute":"qmp_capabilities"} >> {"execute":"human-monitor-command","arguments":{"command-line":"stop"}}' | >> ncat -U mon1.sock >> {"QMP": {"version": {"qemu": {"micro": 50, "minor": 1, "major": 8}, >> "package": "v8.1.0-2976-g4025fde-dirty"}, "capabilities": ["oob"]}} >> {"return": {}} >> Ncat: Connection reset by peer. >> $ >> [1]+ Aborted qemu-system-x86_64 $args < /dev/null >> >> >> # This example preserves the error message. I include it to show the >> ncat-based test is valid. >> >> $ qemu-system-x86_64 $args < /dev/null & >> [1] 18060 >> $ echo '{"execute":"qmp_capabilities"} {"execute":"stop"}' | ncat -U >> mon1.sock >> {"QMP": {"version": {"qemu": {"micro": 50, "minor": 1, "major": 8}, >> "package": "v8.1.0-2976-g4025fde-dirty"}, "capabilities": ["oob"]}} >> {"return": {}} >> injected failure <============= qemu stderr >> Ncat: Connection reset by peer. >> $ >> [1]+ Aborted qemu-system-x86_64 $args < /dev/null >> >> - Steve > > Reproduced, thanks! > > Confirms my reading of the code. Two cases: > > 1. Normal monitor output > > Flushed on newline. A partial last line can be lost on crash or > exit(). > > Works as intended. > > 2. Output being captured for QMP command human-monitor-command > > Never flushed. Instead, the entire buffer is included in the > command's success response. Naturally there is no response on crash. > > Works as intended. > > Here's how to fish unflushed output out of a core dump: > > (gdb) bt > #0 0x00007efeba52fecc in __pthread_kill_implementation () at > /lib64/libc.so.6 > #1 0x00007efeba4dfab6 in raise () at /lib64/libc.so.6 > #2 0x00007efeba4c97fc in abort () at /lib64/libc.so.6 > #3 0x000055588dcc848c in qmp_stop (errp=0x0) at ../monitor/qmp-cmds.c:53 > #4 0x000055588dcc36ab in hmp_stop (mon=0x7ffc01d47cc0, > qdict=0x555891d19200) > at ../monitor/hmp-cmds.c:119 > #5 0x000055588dcc714e in handle_hmp_command_exec > (mon=0x7ffc01d47cc0, cmd=0x55588f0689d0 <hmp_cmds+7760>, > qdict=0x555891d19200) at ../monitor/hmp.c:1106 > #6 0x000055588dcc737b in handle_hmp_command > (mon=0x7ffc01d47cc0, cmdline=0x7efea80036f4 "") at > ../monitor/hmp.c:1158 > #7 0x000055588dcc8922 in qmp_human_monitor_command > (command_line=0x7efea80036f0 "stop", has_cpu_index=false, > cpu_index=0, errp=0x7ffc01d47dd0) at ../monitor/qmp-cmds.c:184 > > [...] > > (gdb) up 4 > #4 0x000055588dcc36ab in hmp_stop (mon=0x7ffc01d47cc0, > qdict=0x555891d19200) > at ../monitor/hmp-cmds.c:119 > 119 qmp_stop(NULL); > (gdb) p mon->outbuf > $1 = (GString *) 0x555890a65260 > (gdb) p *$ > $2 = {str = 0x7efea8002c00 "injected failure\r\n", len = 18, > allocated_len = 128}
Yes. thanks. With the patch, the computer does the work for us, and sometimes the message is definitive enough to fix without further investigation, eg invalid runstate transition: 'suspended -> 'save-vm' But, it's no big deal, and I withdraw the patch request. Thanks for looking at it. - Steve