Peter Crosthwaite <peter.crosthwa...@xilinx.com> writes: > On Fri, Apr 11, 2014 at 11:41 PM, Markus Armbruster <arm...@redhat.com> wrote: >> Peter Crosthwaite <peter.crosthwa...@xilinx.com> writes: >> >>> On Thu, Apr 10, 2014 at 1:48 AM, Markus Armbruster <arm...@redhat.com> >>> wrote: >>>> I stumbled over this while trying to purge error_is_set() from the code. >>>> >>>> >>>> Here's how we commonly use the Error API: >>>> >>>> Error *err = NULL; >>>> >>>> foo(arg, &err) >>>> if (err) { >>>> goto out; >>>> } >>>> bar(arg, &err) >>>> if (err) { >>>> goto out; >>>> } >>>> >>>> This ensures that err is null on entry, both for foo() and for bar(). >>>> Many functions rely on that, like this: >>>> >>>> void foo(ArgType arg, Error **errp) >>>> { >>>> if (frobnicate(arg) < 0) { >>>> error_setg(errp, "Can't frobnicate"); >>>> // This asserts errp != NULL >>>> } >>>> } >>>> >>>> >>>> Here's how some of our visitor code uses the Error API (for real code, >>>> check out generated qmp-marshal.c): >>>> >>>> Error *err = NULL; >>>> QmpInputVisitor *mi = qmp_input_visitor_new_strict(QOBJECT(args)); >>>> Visitor *v = qmp_input_get_visitor(mi); >>>> char *foo = NULL; >>>> char *bar = NULL; >>>> >>>> visit_type_str(v, &foo, "foo", &err); >>>> visit_type_str(v, &bar, "bar", &err); >>>> if (err) { >>>> goto out; >>>> } >>>> >>>> Unlike above, this may pass a non-null errp to the second >>>> visit_type_str(), namely when the first one fails. >>>> >>>> The visitor functions guard against that, like this: >>>> >>>> void visit_type_str(Visitor *v, char **obj, const char *name, Error >>>> **errp) >>>> { >>>> if (!error_is_set(errp)) { >>>> v->type_str(v, obj, name, errp); >>>> } >>>> } >>>> >>>> As discussed before, error_is_set() is almost almost wrong, fragile or >>>> unclean. What if errp is null? Then we fail to stop visiting after an >>>> error. >>> >>> That should be the callers problem. If you pass a NULL errp then the >>> intended semantic is to ignore errors. >> >> The *caller* isn't interested in an error. But the callee's behavior >> should *not* be affected by that at all other than not returning an >> error. >> >> In particular, the callee should never continue after an error just >> because the caller isn't interested in detailed error information. >> > > But the error is from a previous call not the current call. Callers > job to inform second call that first one failed (or in current status > quo - not call the second call at all). But its caller job to know the > dependancy otherwise calls are not self contained. > >> That's why "if (error_is_set(errp)) bail" and similar are always wrong >> when errp is a parameter that may be null. >> > > Agreed. Don't see the problem here though - it's bad caller code too. > >>>> The function could be improved like this: >>>> >>>> void visit_type_str(Visitor *v, char **obj, const char *name, Error >>>> **errp) >>>> { >>>> assert(errp); >>> >>> And this is irregular in that you are now mandating the Error ** and >>> thus removing the capability to ignore errors. >> >> It is irregular. The irregularity is necessary as long as the function >> isn't prepared for a null errp. >> > > My understanding is all functions should be prepared for NULL errp.
If you combine "do nothing when called with an error set" semantics with a "pass null to ignore errors" convention, you set yourself up for accidental misuse. Consider visit_type_str(v, &foo, "foo", errp); visit_type_str(v, &bar, "bar", errp); where visit_type_str() does nothing when error_is_set(errp). This *breaks* when errp is null, but seeing that takes a non-local argument. Here's a simple safety rule: if you do anything with an Error * but pass it on, you must ensure it's not null, and you should make its non-null-ness locally obvious whenever practical. This rule should ensure that you can pass null to ignore errors without changing behavior beyond ignoring the error. >>>> if (!*errp) { >>>> v->type_str(v, obj, name, errp); >>>> } >>>> } >>>> >>>> >>>> But: is it a good idea to have both patterns in the code? Should we >>>> perhaps use the common pattern for visiting, too? Like this: >>>> >>>> visit_type_str(v, &foo, "foo", &err); >>>> if (err) { >>>> goto out; >>>> } >>>> visit_type_str(v, &bar, "bar", &err); >>>> if (err) { >>>> goto out; >>>> } >>>> >>>> Then we can assume *errp is clear on function entry, like this: >>>> >>>> void visit_type_str(Visitor *v, char **obj, const char *name, Error >>>> **errp) >>>> { >>>> v->type_str(v, obj, name, errp); >>>> } >>>> >>>> Should execute roughly the same number of conditional branches. >>>> >>>> Tedious repetition of "if (err) goto out" in the caller, but that's what >>>> we do elsewhere, and unlike elsewhere, these one's are generated. >>>> >>>> Opinions? >>> >>> I think this code as-is is a good example of what we should do >>> elsewhere. The code base has bloated with the if (error) { bail; } on >>> every Error ** accepting API call. I proposed a while back a semantic >>> that Error ** Accepting APIs perform no action when the error is >>> already set to allow for long sequences of calls to run without the >>> constant checks. You then report the first error in a catchall at the >>> end of the run. >>> >>> I think this particular code is probably good, provided your case of >>> NULL errp is enforced against by the caller. >> >> My point isn't that this technique is bad, only that it's different from >> what we do everywhere else, and the two techniques do not combine well. >> >> Here's how we handle errors everywhere else: >> >> void frob([...], Error **errp) >> { >> Error *err = NULL; >> >> foo(&err) >> if (err) { >> goto out; >> } >> bar(&err) >> if (err) { >> goto out; >> } >> [...] >> out: >> error_propagate(errp, err); >> [...] >> } >> >> Both foo() and bar() are never entered with an error set. Consequently, >> they don't check for that case. >> >> If you screw up and call them with an error set, they'll die on the >> first error of their own, because error_set() asserts the error is >> clear. >> >> You might be tempted to elide err, like this: >> >> foo(errp) >> if (error_is_set(errp)) { >> goto out; >> } >> bar(errp) >> if (error_is_set(errp)) { >> goto out; >> } >> [...] >> out: >> [...] >> >> But that's *wrong*, because it executes bar() after foo() failed when >> errp is null. Ignoring errors from frob() also changes what frob() >> does! Not an acceptable interface. >> >> You can elide err only in cases where all you ever do with it is pass it >> on unexamined. >> > > Yeh so you must define an Error * locally in cases where you want the > first call failure to inhibit the second. I think this is reasonable > when its the callers wish (frob()) to behave so. With error_is_set() purged, mistakes will hopefully be fairly obvious, since "if (*errp)" begs the question "what if !errp?" much more than error_is_set(errp) does, and makes mistakes crash hard instead of sweeping them under the carpet. >> An alternative technique is to partly move error checks into the >> callees, like this: >> >> err = NULL; >> foo(&err) >> bar(&err) >> if (err) { >> goto out; >> } >> [...] >> out: >> error_propagate(errp, err); >> [...] >> >> Now bar() must not do anything when called with an error set. It needs >> to begin with code like this: >> >> if (error_is_set(errp) { >> return; >> } >> > > The other compormise is to only bail the second function in its own > error case. Let it run as normal and when it encounters an already set > error, ignore it. This is useful when you have lots of independant > calls one after the other . For an extreme example, check the > ppc/e500.c device tree API calls which I want to convert to Error API > but having to "if (err) { bail };" every one of them is not going to > fly. This is combining errors from independent calls. The current interfaces are designed for a more common pattern: first error in a chain of dependent calls must be handled. If we actually have a need for combining errors, and the current interfaces turn out to be too awkward for that, then let's extend them, but without compromising their usability for the common pattern. >> Trades bloating fewer call sites against bloating all the functions. >> Tradeoff. >> > > In many API cases, number of callers greatly outnumbers number of functions. Yes, but you'd still have to check for errors at many call sites. To see how many, we'd need a diff. >> Note that adding the check above to bar() makes bar() less suited to the >> technique I described first. Remember, with that technique, bar() must >> not be called with an error set, and we rely on an assertion to protect >> us against mistakes. The check above completely bypasses the assertion. >> >> Moreover, having both techniques in the tree is bound to lead to >> confusion. Do I have to check after my function call? Do I have to >> check before doing anything in my function? Sounds like a surefire >> recipe for error handling botches to me. >> >> This is what I mean by "the two techniques do not combine well". Let's >> pick one and stick to it. >> >> The first technique is overwhelmingly prevalent now. The only holdout >> of the second technique is some QAPI-related code. >> >> I'm going to try converting that to the first technique. If you can >> come up with patches converting everything else to the second technique, >> we can discuss which one is better :) >> > > Yeh one day. Will be an awesome -ve diffstat if it happens though. Maybe, maybe not (in more than one way) :)