Re: [PATCH] monitor: flush messages on abort

Markus Armbruster Wed, 15 Nov 2023 08:56:39 -0800

Steven Sistare <steven.sist...@oracle.com> writes:

> On 11/6/2023 5:10 AM, Daniel P. Berrangé wrote:
>> On Fri, Nov 03, 2023 at 03:51:00PM -0400, Steven Sistare wrote:
>>> On 11/3/2023 1:33 PM, Daniel P. Berrangé wrote:
>>>> On Fri, Nov 03, 2023 at 09:01:29AM -0700, Steve Sistare wrote:
>>>>> Buffered monitor output is lost when abort() is called.  The pattern
>>>>> error_report() followed by abort() occurs about 60 times, so valuable
>>>>> information is being lost when the abort is called in the context of a
>>>>> monitor command.
>>>>
>>>> I'm curious, was there a particular abort() scenario that you hit ?
>>>
>>> Yes, while tweaking the suspended state, and forgetting to add transitions:
>>>
>>>         error_report("invalid runstate transition: '%s' -> '%s'",
>>>         abort();
>>>
>>> But I have previously hit this for other errors.
>>>
>>>> For some crude statistics:
>>>>
>>>>   $ for i in abort return exit goto ; do echo -n "$i: " ; git grep --after 1 error_report | grep $i | wc -l ; done
>>>>   abort: 47
>>>>   return: 512
>>>>   exit: 458
>>>>   goto: 177
>>>>
>>>> to me those numbers say that calling "abort()" after error_report
>>>> should be considered a bug, and we can blanket replace all the
>>>> abort() calls with exit(EXIT_FAILURE), and thus avoid the need to
>>>> special case flushing the monitor.
>>>
>>> And presumably add an atexit handler to flush the monitor ala monitor_abort.
>>> AFAICT currently no destructor is called for the monitor at exit time.
>> 
>> The HMP monitor flushes at each newline,  and exit() will take care of
>> flushing stdout, so I don't think there's anything else needed.
>> 
>>>> Also I think there's a decent case to be made for error_report()
>>>> to call monitor_flush().
>>>
>>> A good start, but that would not help for monitors with skip_flush=true,
>>> which need to format the buffered string in a json response, which is the
>>> case I tripped over.
>> 
>> 'skip_flush' is only set to 'true' when using a QMP monitor and invoking
>> "hmp-monitor-command".
>
> OK, that is narrower than I thought.  Now I see that other QMP monitors send 
> error_report() to stderr, hence it is visible after abort and exit:
>
> int error_vprintf(const char *fmt, va_list ap) {
>     if (cur_mon && !monitor_cur_is_qmp())
>         return monitor_vprintf(cur_mon, fmt, ap);
>     return vfprintf(stderr, fmt, ap);                <-- HERE
>
> That surprises me, I thought that would be returned to the monitor caller in
> the json response. I guess the rationale is that the "main" error, if any,
> will be set and returned by the err object that is passed down the stack
> during command evaluation.


Three cases:

1. !cur_mon

   Not executing a monitor command.  We want to report errors etc to
   stderr.

2. cur_mon && !monitor_cur_is_qmp()

   Executing an HMP command.  We want to report errors to the current
   monitor.

3. cur_mon && monitor_cur_is_qmp()

   Executing a QMP command.  What we want is less obvious.

   Somewhere up the call stack is the QMP command's handler function.
   It takes an Error **errp argument.

   Within such a function, any errors need to be passed up the call
   chain into that argument.  Reporting them with error_report() is
   *wrong*.  Reporting must be left to the function's caller.

   A QMP command handler returns its output, it doesn't print it.  So
   calling monitor_printf() is wrong, too.

   But what about warn_report()?  Is that wrong, too?  We decided it's
   not, mostly because we have nothing else to offer.

   The stupidest way to keep it useful in QMP command context is to have
   error_vprintf() print to stderr.  So that's what it does.

   We could instead accumulate error_vprintf() output in a buffer, and
   include it with the QMP reply.  However, it's not clear what a
   management application could do with it.  So we stick to stupid.
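
The three cases above can be sketched as a standalone toy, roughly like
this.  It is not QEMU code: SketchMonitor, sketch_cur_mon,
sketch_pick_sink() and the sketch_error_*printf() helpers are
hypothetical stand-ins for cur_mon, monitor_cur_is_qmp() and
error_vprintf(), and the fixed-size buffer is only an assumption made to
keep the example self-contained.

```c
#include <stdarg.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for a monitor; not QEMU's real Monitor type. */
typedef struct SketchMonitor {
    bool is_qmp;        /* true: QMP monitor, false: HMP monitor */
    char buf[256];      /* buffered HMP output (assumed fixed size) */
    size_t len;         /* bytes currently held in buf */
} SketchMonitor;

/* NULL when no monitor command is executing (case 1). */
static SketchMonitor *sketch_cur_mon;

typedef enum { SINK_STDERR, SINK_MONITOR } SketchSink;

/* Decide where error output goes, following the three cases. */
static SketchSink sketch_pick_sink(void)
{
    if (sketch_cur_mon && !sketch_cur_mon->is_qmp) {
        return SINK_MONITOR;    /* case 2: executing an HMP command */
    }
    return SINK_STDERR;         /* case 1 (no monitor) and case 3 (QMP) */
}

static int sketch_error_vprintf(const char *fmt, va_list ap)
{
    if (sketch_pick_sink() == SINK_MONITOR) {
        SketchMonitor *mon = sketch_cur_mon;
        size_t room = sizeof(mon->buf) - mon->len;   /* always >= 1 */
        int n = vsnprintf(mon->buf + mon->len, room, fmt, ap);

        if (n > 0) {
            /* On truncation, only room - 1 bytes were actually stored. */
            mon->len += ((size_t)n < room) ? (size_t)n : room - 1;
        }
        return n;
    }
    /* The "stupid" path: print straight to stderr. */
    return vfprintf(stderr, fmt, ap);
}

static int sketch_error_printf(const char *fmt, ...)
{
    va_list ap;
    int ret;

    va_start(ap, fmt);
    ret = sketch_error_vprintf(fmt, ap);
    va_end(ap);
    return ret;
}
```

With sketch_cur_mon pointing at an HMP-style monitor, the output lands in
that monitor's buffer; with NULL or a QMP-style monitor it goes to
stderr, which is why it stays visible after abort() or exit().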

[...]
