sammccall added subscribers: gribozavr, klimek.
sammccall added a comment.

In D61722#1508050 <https://reviews.llvm.org/D61722#1508050>, @rsmith wrote:

> The simplest thing to do might be to ensure that `vector<ErrorTy>` is itself 
> `ErrorTy`.


You're probably right, I'll try this.

> For example, the constant evaluator will want to quickly bail out of 
> evaluating expressions and statements containing errors

Thinking more - how important is this? These expressions are expected to be 
rare.
Discussing offline with @gribozavr, it seemed likely we could get away with the 
"errorness" of an expr being a local property, which would be simplifying. 
(Whereas the errorness of a type needs to be a transitive one).

> What kinds of cases are you thinking of that would be in your second category?

Incomplete code, either new code or during refactoring

- not enough arguments
- symbol not found due to missing #include
- member not added yet
- function not defined yet

In an IDE context, it's *much* more important that diagnostics are useful and 
AST nodes aren't discarded, than that recovery correctly "fixes" the AST to 
have the intended semantics.

From that point of view, typo recovery/TypoExpr spends too much complexity 
budget on accurate recovery, rather than generality. RecoveryExpr is a 
different part of the design space for sure.

> I would assume that we would represent a function call for which the callee 
> or an argument contains an error as a `CallExpr` with the "contains error" 
> flag set and with `ErrorTy` as its type.

Almost:

- if the argument has an error, but its type is still known, then overload 
resolution runs as normal and this CallExpr doesn't itself have an error. (If 
there's a **transitive** HasError bit on Expr, it would be set).
- if the argument has an error and has ErrorTy, overload resolution may still 
succeed (depending on the rules we choose for ErrorTy). In which case again, 
the CallExpr itself has no error.
- if the argument has an error and has ErrorTy, and this causes overload 
resolution to fail, but all (best) overloads have the same type, we get a 
RecoveryExpr/CallExpr-with-error with a real type
- if the argument has an error and has ErrorTy, and this causes overload 
resolution to fail, and the type is unknown, then we get a 
RecoveryExpr/CallExpr-with-error with ErrorTy.

> Could we use the same representation here? (That is, represent this case as a 
> `CallExpr` whose type is `ErrorTy` with the "contains error" flag set, with 
> no `RecoveryExpr`.)

We can, but it breaks a lot of client code. These two cases are pretty 
different for clients:

- for an otherwise-valid CallExpr where one argument has errors, all the 
*local* invariants are preserved.
- for a CallExpr where callee is null, or the #args don't match the function, 
the client code often crashed or did the wrong thing.

My view is that we're fundamentally better having two types to keep the 
guarantees around `CallExpr` stronger. That we want matchers to "match through" 
RecoveryExpr, but callExpr shouldn't match here.
@ilya-biryukov disagreed and thought the main virtue of `RecoveryExpr` is 
backwards-compatibility.
Either way, I'm not optimistic about being able to get recovery changes to 
stick under this model. I'm willing to try if you feel strongly.

>> access of a nonexistent member Base expression is captured, type is never 
>> available.
> 
> Using `UnresolvedMemberExpr` for this case might make sense. I think we 
> should think about how we'd represent a member access that can't be fully 
> analyzed because the object expression contains errors, and think about 
> whether it makes sense to use that same representation for the case where the 
> error is immediately in forming the member access itself. (And similarly 
> across all other kinds of recovery expression we might create.)

I agree it's important to be consistent here. It's hard work to be consistent 
and also precise!
Using RecoveryExpr for both cases seems sufficient to me, and one of its 
advantages is that it avoids having to make lots of hard decisions one-by-one, 
many of which will be wrong :-)

But that aside, I agree the best way to model that in the existing AST is make 
`UnresolvedMemberExpr` available in C, allow it to have no candidate decls, etc.
For the object expression, the error can be nested arbitrarily deep and I think 
we just have to apply whatever usual "subexpression-has-errors" strategy to 
`UnresolvedMemberExpr` and `MemberExpr`.

> If we want to enable (eg) AST matchers to operate on erroneous AST, using a 
> homogeneous `RecoveryExpr` for all different syntaxes that originate errors 
> seems like it would introduce complexity, especially if all consumers of the 
> AST already need to deal with the case that some existing AST node is marked 
> as being erroneous so don't get any benefit from having a concrete 
> `RecoveryExpr` as a known stopping point. (If you get there, you've probably 
> already gone too far.)

Based on experiments, I think the desired behavior for ASTMatchers is that 
`callExpr()` wouldn't match a broken call, but descendant/ancestor matching 
would still traverse "through" broken calls.
This gets most of the possible benefit for little cost: matchers can "for free" 
match the rest of the expression tree that gets dropped today, but e.g. 
existing clang-tidy checks won't bind directly to the broken parts of the AST.
Matchers/options to opt-in can be added. e.g. `brokenCallExpr(...)` or 
`recoveryExpr(attemptedExpr(CallExpr), ...)` as the case may be.

I thought @klimek was on board with this approach, but he can correct me if 
not...


Repository:
  rC Clang

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D61722/new/

https://reviews.llvm.org/D61722



_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

Reply via email to