Re: [Qemu-devel] [RFC 00/15] Error API: Flag errors in *errp even if errors are being ignored

Markus Armbruster Wed, 28 Jun 2017 02:06:41 -0700

Eduardo Habkost <ehabk...@redhat.com> writes:

> Rationale
> ---------
>
> I'm often bothered by the fact that we can't write the following:
>
>     foo(arg, errp);
>     if (*errp) {
>         handle the error...
>         error_propagate(errp, err);
>     }
>
> because errp can be NULL.


If foo() additionally returned an indication of success, you could write

      if (!foo(arg, errp)) {    // assuming foo() returns a bool
          handle the error...
      }

Nicely concise.

For what it's worth, this is how GLib wants GError to be used.  We
deviated from it, and it has turned out to be a self-inflicted wound.

> I understand the reason we need to support errp==NULL, as it
> makes life simpler for callers that don't want any extra error
> information.  However, this has the cost of making the functions
> that report errors more complex and error-prone.
>
> (Evidence of that: the 34 ERR_IS_* cases handled by the "use
> ERR_IS_* macros" patches in the series.  Where existing code will
> crash or behave differently if errp is NULL.)

Which of them could *not* use a suitable return value instead of *errp?

> I considered suggesting forbidding NULL errp, and just changing
> all callers that use NULL to have an err variable and call
> error_free(), but this would mean changing 690 function callers
> that pass NULL errp as argument.

Would also add lots of pointless malloc churn.  The ability to ignore
errors is a feature.

> Here I'm proposing a mechanism to have the best of both worlds:
> allow callers to ignore errors easily while allowing functions to
> propagate errors without an intermediate local_err variable.
>
> The Proposal
> ------------
>
> I'm proposing replacing NULL errp with a special macro:
> IGNORE_ERRORS.  The macro will trigger special behavior in the
> error API that will make it not save any error information in the
> error pointer, but still keep track of boolean error state in
> *errp.
>
> This will allow us to simplify the documented method to propagate errors
> from:
>
>     Error *err = NULL;
>     foo(arg, &err);
>     if (err) {
>         handle the error...
>         error_propagate(errp, err);
>     }
>
> to:
>
>     foo(arg, errp);
>     if (ERR_IS_SET(errp)) {
>         handle the error...
>     }
>
> This will allow us to stop using local_err variables and
> error_propagate() on hundreds of cases.

Getting rid of unnecessary local_err boilerplate is good.  The question
is how.  A possible alternative to your proposal is to follow GLib and
make functions return success/failure.

How do the two compare?

> Implementation
> --------------
>
> This replaces NULL errp arguments on function calls with a
> IGNORE_ERRORS macro.  Checks for (!errp) are replaced by
> ERR_IS_IGNORED(errp).  Checks for (*errp) are replaced by
> ERR_IS_SET(errp).  No extra changes are required on function
> callers.

That's a lot of churn.  One time pain, of course.

> Then IGNORE_ERRORS is implemend as:
>
>   (& { &ignored_error_unset })
>
> When error_set() is called and IGNORE_ERRORS was used, we set
> error state using:
>
>   *errp = &ignored_error_set;
>
> This way, we can make ERR_IS_SET work even if errp was
> IGNORE_ERRORS.  The ERR_IS_* macros are reimplemented as:
>
>   #define ERR_IS_SET(e) (*(e) && *(e) != &ignored_error_unset)
>   #define ERR_IS_IGNORED(e) (!(e) || *(e) == &ignored_error_unset || *(e) == 
> &ignored_error_set)
>
> Ensuring errp is never NULL
> ---------------------------
>
> The last patch on this series changes the (Error **errp)
> parameters in functions to (Error *errp[static 1]), just to help
> validate the existing code, as clang warns about NULL arguments
> on that case.  I don't think we should apply that patch, though,
> because the "[static 1]" syntax confuses Coccinelle.
>
> I have a branch where I experimented with the idea of replacing
> (Error **errp) parameters with an opaque type (void*, or a struct
> type).  I gave up when I noticed it would require touching all
> callers to replace &err with a wrapper macro to convert to the
> right type.  Suggestions to make NULL errp easier to detect at
> build time are welcome.
>
> (Probably the easiest solution for that is to add assert(errp)
> lines to the ERR_IS_*() macros.)

We'll obviously struggle with null arguments until all the developers
adjusted to the new interface.  Possibly with occasional mistakes
forever.  Compile-time checking would really, really help.

> Desirable side-effects
> ----------------------
>
> Some of additional benefits from parts of this series:
>
> * Making code that ignore error information more visible and
>   greppable (using IGNORE_ERRORS).

True.

Drawback: Passing an address takes more code than passing null.  Not
sure it matters.

>                                     I believe many of those cases
>   are actually bugs and should use &error_abort or &error_fatal
>   instead.

I've seen such bugs.

Of course, making possible offenders more greppable doesn't necessarily
mean existing ones will get fixed and new ones will be avoided.

> * Making code that depends on errp more visible and
>   greppable (using ERR_IS_* macros).  Some of those cases are
>   also likely to be bugs, and need to be investigated.

Grepping for (local_)?errp? works well enough, doesn't it?

> TODO
> ----
>
> * Simplify more cases of local_error/error_propagate() to use
>   errp directly.
> * Update API documentation and code examples.
> * Add a mechanism to ensure errp is never NULL.
>
> Git branch
> ----------
>
> This series depend on a few extra cleanups that I didn't submit
> to qemu-devel yet.  A git branch including this series is
> available at:
>
>   git://github.com/ehabkost/qemu-hacks.git work/err-api-rework-ignore-ptr-v1

Re: [Qemu-devel] [RFC 00/15] Error API: Flag errors in *errp even if errors are being ignored

Reply via email to