Re: [Qemu-devel] [RFC] error: auto propagated local_err

Eric Blake Thu, 19 Sep 2019 06:57:51 -0700

On 9/19/19 4:17 AM, Kevin Wolf wrote:
> Am 18.09.2019 um 19:10 hat Eric Blake geschrieben:
>> On 9/18/19 8:02 AM, Vladimir Sementsov-Ogievskiy wrote:
>>> + */
>>> +#define MAKE_ERRP_SAFE(errp) \
>>> +g_auto(ErrorPropagationStruct) (__auto_errp_prop) = {.errp = (errp)}; \
>>> +if ((errp) == NULL || *(errp) == error_abort || *(errp) == error_fatal) { \
>>> +    (errp) = &__auto_errp_prop.local_err; \
>>> +}
>>
>> Not written to take a trailing semicolon in the caller.
>>
>> You could even set __auto_errp_prop unconditionally rather than trying
>> to reuse incoming errp (the difference being that error_propagate() gets
>> called more frequently).
> 
> I think this difference is actually a problem.
> 
> When debugging things, I hate error_propagate(). It means that the Error
> (specifically its fields src/func/line) points to the outermost
> error_propagate() rather than the place where the error really happened.
> It also makes error_abort completely useless because at the point where
> the process gets aborted, the interesting information is already lost.


Okay, based on that, I see the following desirable semantics:

Caller: one of 4 calling paradigms:

pass errp=NULL (we don't care about failures)
pass errp=&error_abort (we want to abort() as soon as possible as close
to the real problem as possible)
pass errp=&error_fatal (we want to exit(), but only after collecting as
much information as possible)
pass errp = anything else (we are collecting an error for other reasons,
we may report it or let the caller decide or ...)

Callee: we want a SINGLE paradigm:

func (Error **errp)
{
    MAKE_ERRP_SAFE();

    now we can pass errp to any child function, test '*errp', or do
anything else, and we DON'T have to call error_propagate.

I think that means we need:

#define MAKE_ERRP_SAFE() \
  g_auto(...) __auto_errp = { .errp = errp }; \
  do { \
    if (!errp || errp == &error_fatal) { errp = &__auto_errp.local; } \
  } while (0)

So back to the caller semantics:

if the caller passed NULL, we've redirected errp locally so that we can
use *errp at will; the auto-cleanup will free our local error.

if the caller passed &error_abort, we keep errp unchanged.  *errp tests
will never trigger, because we'll have already aborted in the child on
the original errp, giving developers the best stack trace.

if the caller passed &error_fatal, we redirect errp.  auto-cleanup will
then error_propagate that back to the caller, producing as much nice
information as possible.

if the caller passed anything else, we keep errp unchanged, so no extra
error_propagate in the mix.

> 
> So I'd really like to restrict the use of error_propagate() to places
> where it's absolutely necessary. Unless, of course, you can fix these
> practical problems that error_propagate() causes for debugging.
> 
> In fact, in the context of Greg's series, I think we really only need to
> support hints for error_fatal, which are cases that users are supposed
> to see. We should exclude error_abort in MAKE_ERRP_SAFE() because these
> are things that are never supposed to happen. A good stack trace is more
> important there than adding a hint to the message.

We also want to handle the caller passing NULL, so that we no longer
have to introduce 'Error *local_error = NULL' everywhere.

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3226
Virtualization:  qemu.org | libvirt.org

Re: [Qemu-devel] [RFC] error: auto propagated local_err

Reply via email to