rsmith added inline comments.

================
Comment at: include/clang/Basic/AttrDocs.td:3513
+  int x = 2;
+  int &y = x;  // warning: 'noderef' can only be used on an array or pointer 
type
+
----------------
Should `noderef` appear somewhere in this example? :)


================
Comment at: include/clang/Basic/DiagnosticSemaKinds.td:9452-9453
+  "dereferencing expression marked as 'noderef'">, InGroup<NoDeref>;
+def warn_noderef_on_non_pointer_or_array : Warning<
+  "'noderef' can only be used on an array or pointer type">, InGroup<NoDeref>;
 } // end of sema component.
----------------
Putting this under `-Wonderef` doesn't really seem appropriate to me -- you're 
using `-Wnoderef` above to mean "warn about cases where a `noderef` pointer is 
dereferenced", whereas this warning is about invalid use of an attribute. I 
think `-Wignored-attributes` (`InGroup<IgnoredAttributes>`) would be a better 
choice.


================
Comment at: lib/Parse/ParseExpr.cpp:1126
+
+    Actions.StartCheckingNoDeref();
+
----------------
leonardchan wrote:
> rsmith wrote:
> > leonardchan wrote:
> > > rsmith wrote:
> > > > This parser-driven start/stop mechanism will not work in C++ templates. 
> > > > Instead, can you remove the "start" part and check the noderef exprs as 
> > > > part of popping the ExprEvaluationContextRecord?
> > > I'm not sure if I should remove the the start and stop methods because 
> > > for a regular C program, the Push/PopExprEvaluationContextRecord 
> > > functions don't seem to be called, and even when they are called in C++, 
> > > the initial record that exists on the stack isn't popped at all.
> > > 
> > > Since pending noderef expressions are still parsed and added to the last 
> > > record during template instantiation, doing another check when popping 
> > > covers all noderef exprs added during instantiation.
> > `PushExpressionEvaluationContext` / `PopExpressionEvaluationContext` are 
> > called regardless of which language we're parsing. If we're missing 
> > `ExpressionEvaluationContext` records around some expression parsing, we 
> > should fix that. We should never be creating expressions within the initial 
> > `ExpressionEvaluationContext` record (except perhaps during error recovery).
> > 
> > > Since pending noderef expressions are still parsed and added to the last 
> > > record during template instantiation, doing another check when popping 
> > > covers all noderef exprs added during instantiation.
> > 
> > That's not how template instantiation works in Clang. We don't re-parse, we 
> > perform a recursive tree transformation that does not involve the parser.
> So when should a new `ExpressionEvaluationContext` be pushed or popped? 
> 
> For the following code:
> 
> ```
> #define NODEREF __attribute__((noderef))
> 
> void func(){
>   int NODEREF *x;
>   *x;
> }
> 
> int main(){
>   func();
> }
> ```
> 
> A new context is pushed then popped in C++ but not for C. From what I can 
> tell based off my observations and looking for where `Push/Pop` get called in 
> code, new ones would get added when we enter a new GNU style statement 
> expression, captured region after a pragma, or different error blocks.
Hmm, it looks like we don't push/pop expression evaluation context records for 
function definitions or variable initializers currently (instead we have a 
single global "potentially evaluated" context which wraps all such contexts and 
appears to never be popped).

The fact that we share a single record for all functions / variables is likely 
simply because it never previously mattered that we were merging 
notionally-distinct contexts in this way. I suspect actually fixing that will 
uncover more problems (most likely, pre-existing bugs, but barriers to your 
progress nonetheless).


================
Comment at: lib/Parse/ParseStmt.cpp:102-104
+  Actions.PushExpressionEvaluationContext(
+      Actions.ExprEvalContexts.back().Context);
   ParenBraceBracketBalancer BalancerRAIIObj(*this);
----------------
Hmm, we call `ParseStatementOrDeclaration` rather a lot. Can we pull this up 
into its ultimate callers instead:

  * `Parser::ParseFunctionDefinition`
  * `Parser::ParseLateTemplatedFuncDef`
  * `Parser::ParseLexedMethodDef`
  * `Parser::ParseLexedObjCMethodDefs`

? Actually, we can do better than that: those functions all call either 
`ActOnStartOfFunctionDef` or `ActOnStartOfObjCMethodDef` first, and 
`ActOnFinishFunctionBody` when they're done. We should put the 
`PushExpressionEvaluationContext` and `PopExpressionEvaluationContext` calls in 
those functions instead.

We're missing at least one other place: when parsing the initializer for a 
global variable, in C++ we call `Sema::ActOnCXXEnterDeclInitializer`, which 
pushes a potentially-evaluated context. But in C, the `InitializerScopeRAII` 
object (in `Parser::ParseDeclaratoinAfterDeclaratorAndAttributes`) doesn't call 
into `Sema` and we don't ever push a suitable expression evaluation context.

You can find if any other places are missing by removing `Sema`'s global 
`ExpressionEvaluationContext` and adding an assert that fires if we try to 
parse an expression with no `ExpressionEvaluationContext`, and then running the 
unit tests. (And we should check in that change to ensure this doesn't regress.)


================
Comment at: lib/Sema/SemaExpr.cpp:4289
+
+  if (TypeHasNoDeref(ResultTy)) {
+    LastRecord.PossibleDerefs.insert(E);
----------------
leonardchan wrote:
> rsmith wrote:
> > Do you ensure that the `noderef` attribute will be on the innermost level 
> > of the array type? I think you'll need to do so in order to warn on:
> > 
> > ```
> > typedef int A[32];
> > typedef A __attribute__((noderef)) *B;
> > int f(B b) { return (*B)[1]; }
> > ```
> > 
> > (Here, we have a pointer to a noderef annotated array of 
> > non-noderef-annotated int. So I think we will not emit a warning from the 
> > first dereference, because we have a pointer to an array, and we will not 
> > emit a warning from the second dereference in the array indexing, because 
> > the result type does not have the noderef attribute.)
> Hmmmm, so normally in order to check what's correct, I usually run these 
> examples through `sparse` since that's the tool that actually checks 
> `noderef` for gcc, but it seems that sparse instead diagnoses a warning on 
> the array indexing instead and nothing for the first dereference.
> 
> This shouldn't be though since, as you pointed out, the array does not have 
> `noderef` types. For a simple array,
> 
> ```
> int __attribute__((noderef)) x[10];
> x[0];
> ```
> 
> `sparse` diagnoses the appropriate warning for the array index. Personally 
> though, I would chalk this up to an edge case that wasn't thought of before 
> in sparse, since something like this isn't handled on their existing 
> validation tests.
> 
> Currently, this diagnoses a warning on the first dereference, but I can also 
> understand why it shouldn't warn because accessing `noderef` structs 
> shouldn't warn if the member accessed is an array. The only case though this 
> applies in sparse's tests are with structs and they don't provide a test for 
> dereferencing a pointer to an array.
> 
> I think the reason for why the struct example still passes is that if the 
> member is an array, none of the actual data in the struct is being accessed. 
> Instead, the value returned is just a pointer to the start of the array 
> within the struct and equivalent to just adding an offset to the struct 
> pointer. I think that no warning should also be thrown for this example if it 
> is guaranteed that array `A` can similarly be found from simple pointer 
> arithmetic on pointer `B`. I don't think it can unless B were an array or 
> struct also, but I could be wrong or overthinking this.
Arrays-of-arrays and structs-containing-arrays are the same in this regard: in 
both cases, accessing the first level (array element or struct member) gives 
you an array lvalue, and only accessing that second array will actually 
dereference memory. So I think the treatment of the two cases should be 
consistent. For member access, you defer the check if the element type is an 
array, and I think you should do the same thing here.

There are two consequences of this:

1) If the array element type is an array, you should bail out of here and not 
add `E` to `PossibleDerefs`.
2) If the array type (not the element type) has the `noderef` attribute, you 
should propagate it down to the array element type so that we can catch a later 
pointer dereference / array indexing operation.

For point 2, what this means in my example is that the expression `*B` should 
have type "array of 32 noderef ints" so that further noderef checks apply to 
it. The easiest way to accomplish this would be to check, when applying noderef 
to a type, whether the type is an array type, and if so also apply noderef to 
the array's element type.


================
Comment at: lib/Sema/SemaExpr.cpp:4164
 
+bool Sema::TypeHasNoDeref(QualType Ty) { return Ty->hasAttr(attr::NoDeref); }
+
----------------
If it makes sense for this to exist at all, it should be a member of `Type` 
instead.


================
Comment at: lib/Sema/SemaExpr.cpp:4257-4259
+  auto FoundExpr = LastRecord.PossibleDerefs.find(StrippedExpr);
+  if (FoundExpr != LastRecord.PossibleDerefs.end())
+    LastRecord.PossibleDerefs.erase(*FoundExpr);
----------------
Replace these three lines with just

```
LastRecord.PossibleDerefs.erase(StrippedExpr);
```


================
Comment at: test/Frontend/noderef_on_non_pointers.cpp:4
+/**
+ * Test 'noderef' attribute against other pointer-like types.
+ */
----------------
I'd like to see a test that we get a wraning for attempting to bind a reference 
to a dereferenced `noderef` pointer.

In C++, should it be valid to assign a `noderef` pointer to a dereferenceable 
pointer without a cast?


Repository:
  rC Clang

https://reviews.llvm.org/D49511



_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
  • [PATCH] D49511: [... Richard Smith - zygoloid via Phabricator via cfe-commits

Reply via email to