rsmith added inline comments.
================ Comment at: include/clang/Basic/AttrDocs.td:3513 + int x = 2; + int &y = x; // warning: 'noderef' can only be used on an array or pointer type + ---------------- Should `noderef` appear somewhere in this example? :) ================ Comment at: include/clang/Basic/DiagnosticSemaKinds.td:9452-9453 + "dereferencing expression marked as 'noderef'">, InGroup<NoDeref>; +def warn_noderef_on_non_pointer_or_array : Warning< + "'noderef' can only be used on an array or pointer type">, InGroup<NoDeref>; } // end of sema component. ---------------- Putting this under `-Wonderef` doesn't really seem appropriate to me -- you're using `-Wnoderef` above to mean "warn about cases where a `noderef` pointer is dereferenced", whereas this warning is about invalid use of an attribute. I think `-Wignored-attributes` (`InGroup<IgnoredAttributes>`) would be a better choice. ================ Comment at: lib/Parse/ParseExpr.cpp:1126 + + Actions.StartCheckingNoDeref(); + ---------------- leonardchan wrote: > rsmith wrote: > > leonardchan wrote: > > > rsmith wrote: > > > > This parser-driven start/stop mechanism will not work in C++ templates. > > > > Instead, can you remove the "start" part and check the noderef exprs as > > > > part of popping the ExprEvaluationContextRecord? > > > I'm not sure if I should remove the the start and stop methods because > > > for a regular C program, the Push/PopExprEvaluationContextRecord > > > functions don't seem to be called, and even when they are called in C++, > > > the initial record that exists on the stack isn't popped at all. > > > > > > Since pending noderef expressions are still parsed and added to the last > > > record during template instantiation, doing another check when popping > > > covers all noderef exprs added during instantiation. > > `PushExpressionEvaluationContext` / `PopExpressionEvaluationContext` are > > called regardless of which language we're parsing. If we're missing > > `ExpressionEvaluationContext` records around some expression parsing, we > > should fix that. We should never be creating expressions within the initial > > `ExpressionEvaluationContext` record (except perhaps during error recovery). > > > > > Since pending noderef expressions are still parsed and added to the last > > > record during template instantiation, doing another check when popping > > > covers all noderef exprs added during instantiation. > > > > That's not how template instantiation works in Clang. We don't re-parse, we > > perform a recursive tree transformation that does not involve the parser. > So when should a new `ExpressionEvaluationContext` be pushed or popped? > > For the following code: > > ``` > #define NODEREF __attribute__((noderef)) > > void func(){ > int NODEREF *x; > *x; > } > > int main(){ > func(); > } > ``` > > A new context is pushed then popped in C++ but not for C. From what I can > tell based off my observations and looking for where `Push/Pop` get called in > code, new ones would get added when we enter a new GNU style statement > expression, captured region after a pragma, or different error blocks. Hmm, it looks like we don't push/pop expression evaluation context records for function definitions or variable initializers currently (instead we have a single global "potentially evaluated" context which wraps all such contexts and appears to never be popped). The fact that we share a single record for all functions / variables is likely simply because it never previously mattered that we were merging notionally-distinct contexts in this way. I suspect actually fixing that will uncover more problems (most likely, pre-existing bugs, but barriers to your progress nonetheless). ================ Comment at: lib/Parse/ParseStmt.cpp:102-104 + Actions.PushExpressionEvaluationContext( + Actions.ExprEvalContexts.back().Context); ParenBraceBracketBalancer BalancerRAIIObj(*this); ---------------- Hmm, we call `ParseStatementOrDeclaration` rather a lot. Can we pull this up into its ultimate callers instead: * `Parser::ParseFunctionDefinition` * `Parser::ParseLateTemplatedFuncDef` * `Parser::ParseLexedMethodDef` * `Parser::ParseLexedObjCMethodDefs` ? Actually, we can do better than that: those functions all call either `ActOnStartOfFunctionDef` or `ActOnStartOfObjCMethodDef` first, and `ActOnFinishFunctionBody` when they're done. We should put the `PushExpressionEvaluationContext` and `PopExpressionEvaluationContext` calls in those functions instead. We're missing at least one other place: when parsing the initializer for a global variable, in C++ we call `Sema::ActOnCXXEnterDeclInitializer`, which pushes a potentially-evaluated context. But in C, the `InitializerScopeRAII` object (in `Parser::ParseDeclaratoinAfterDeclaratorAndAttributes`) doesn't call into `Sema` and we don't ever push a suitable expression evaluation context. You can find if any other places are missing by removing `Sema`'s global `ExpressionEvaluationContext` and adding an assert that fires if we try to parse an expression with no `ExpressionEvaluationContext`, and then running the unit tests. (And we should check in that change to ensure this doesn't regress.) ================ Comment at: lib/Sema/SemaExpr.cpp:4289 + + if (TypeHasNoDeref(ResultTy)) { + LastRecord.PossibleDerefs.insert(E); ---------------- leonardchan wrote: > rsmith wrote: > > Do you ensure that the `noderef` attribute will be on the innermost level > > of the array type? I think you'll need to do so in order to warn on: > > > > ``` > > typedef int A[32]; > > typedef A __attribute__((noderef)) *B; > > int f(B b) { return (*B)[1]; } > > ``` > > > > (Here, we have a pointer to a noderef annotated array of > > non-noderef-annotated int. So I think we will not emit a warning from the > > first dereference, because we have a pointer to an array, and we will not > > emit a warning from the second dereference in the array indexing, because > > the result type does not have the noderef attribute.) > Hmmmm, so normally in order to check what's correct, I usually run these > examples through `sparse` since that's the tool that actually checks > `noderef` for gcc, but it seems that sparse instead diagnoses a warning on > the array indexing instead and nothing for the first dereference. > > This shouldn't be though since, as you pointed out, the array does not have > `noderef` types. For a simple array, > > ``` > int __attribute__((noderef)) x[10]; > x[0]; > ``` > > `sparse` diagnoses the appropriate warning for the array index. Personally > though, I would chalk this up to an edge case that wasn't thought of before > in sparse, since something like this isn't handled on their existing > validation tests. > > Currently, this diagnoses a warning on the first dereference, but I can also > understand why it shouldn't warn because accessing `noderef` structs > shouldn't warn if the member accessed is an array. The only case though this > applies in sparse's tests are with structs and they don't provide a test for > dereferencing a pointer to an array. > > I think the reason for why the struct example still passes is that if the > member is an array, none of the actual data in the struct is being accessed. > Instead, the value returned is just a pointer to the start of the array > within the struct and equivalent to just adding an offset to the struct > pointer. I think that no warning should also be thrown for this example if it > is guaranteed that array `A` can similarly be found from simple pointer > arithmetic on pointer `B`. I don't think it can unless B were an array or > struct also, but I could be wrong or overthinking this. Arrays-of-arrays and structs-containing-arrays are the same in this regard: in both cases, accessing the first level (array element or struct member) gives you an array lvalue, and only accessing that second array will actually dereference memory. So I think the treatment of the two cases should be consistent. For member access, you defer the check if the element type is an array, and I think you should do the same thing here. There are two consequences of this: 1) If the array element type is an array, you should bail out of here and not add `E` to `PossibleDerefs`. 2) If the array type (not the element type) has the `noderef` attribute, you should propagate it down to the array element type so that we can catch a later pointer dereference / array indexing operation. For point 2, what this means in my example is that the expression `*B` should have type "array of 32 noderef ints" so that further noderef checks apply to it. The easiest way to accomplish this would be to check, when applying noderef to a type, whether the type is an array type, and if so also apply noderef to the array's element type. ================ Comment at: lib/Sema/SemaExpr.cpp:4164 +bool Sema::TypeHasNoDeref(QualType Ty) { return Ty->hasAttr(attr::NoDeref); } + ---------------- If it makes sense for this to exist at all, it should be a member of `Type` instead. ================ Comment at: lib/Sema/SemaExpr.cpp:4257-4259 + auto FoundExpr = LastRecord.PossibleDerefs.find(StrippedExpr); + if (FoundExpr != LastRecord.PossibleDerefs.end()) + LastRecord.PossibleDerefs.erase(*FoundExpr); ---------------- Replace these three lines with just ``` LastRecord.PossibleDerefs.erase(StrippedExpr); ``` ================ Comment at: test/Frontend/noderef_on_non_pointers.cpp:4 +/** + * Test 'noderef' attribute against other pointer-like types. + */ ---------------- I'd like to see a test that we get a wraning for attempting to bind a reference to a dereferenced `noderef` pointer. In C++, should it be valid to assign a `noderef` pointer to a dereferenceable pointer without a cast? Repository: rC Clang https://reviews.llvm.org/D49511 _______________________________________________ cfe-commits mailing list cfe-commits@lists.llvm.org http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits