[PATCH] D153059: [-Wunsafe-buffer-usage] Group parameter fix-its

Ziqing Luo via Phabricator via cfe-commits Wed, 21 Jun 2023 12:57:11 -0700

ziqingluo-90 added inline comments.


================
Comment at: clang/lib/Analysis/UnsafeBufferUsage.cpp:2077
+
   for (const auto &[VD, Fixables] : FixablesForUnsafeVars.byVar) {
     FixItsForVariable[VD] =
----------------
NoQ wrote:
> There's a bug in variable naming: `FixablesForUnsafeVars`actually contains 
> fixables for *all* variables. Rashmi correctly renames it in D150489 to 
> reflect its actual behavior. So IIUC you're generating fixits for all 
> variables here, which is probably not what you want.
Thanks for pointing out the incorrectness of the variable name.   Actually, 
whether a variable is unsafe is irrelevant to this function.  The function only 
cares about all variables that need fixes.  So I think the incorrect name of 
the variable does not affect the functionality of my changes here.


================
Comment at: clang/lib/Analysis/UnsafeBufferUsage.cpp:2278-2281
+          if (isParameterOf(Impl.first, D))
+            ParmsNeedFix.insert(Impl.first);
+          if (isParameterOf(Impl.second, D))
+            ParmsNeedFix.insert(Impl.second);
----------------
NoQ wrote:
> Does this really go in both directions? Shouldn't it be just one direction?
> 
> ```lang=c++
> void foo(int *p) {
>   int *q = new int[10];
>   p = q;
>   q[5] = 7;
> }
> ```
> In this case `q` is an unsafe local buffer, but `p` is a (mostly) safe 
> single-object pointer that doesn't need fixing, despite implication from `q` 
> to `p`. Of course this example is somewhat unrealistic (people don't usually 
> overwrite their parameters), but it is also the only situation where the 
> other direction matters.
You are right!  It should be one-direction.


================
Comment at: clang/lib/Analysis/UnsafeBufferUsage.cpp:2293
+    //      search of connected components.
+    if (!ParmsNeedFix.empty()) {
+      auto First = ParmsNeedFix.begin(), Last = First;
----------------
NoQ wrote:
> What about the situation where params aren't seen as unsafe yet, but they're 
> discovered to be unsafe later? Eg.
> ```
> void foo(int *p, int *q) {
>   int *p2 = p;
>   p2[5] = 7;
>   int *q2 = q;
>   q2[5] = 7;
> }
> ```
> Will we notice that `p` and `q` need to be fixed simultaneously?
> 
> ---
> 
> I suspect that the problem of parameter grouping can't be solved by 
> pre-populating strategy implications between all parameters. It'll either 
> cause you to implicate all parameters (even the ones that will never need 
> fixing), or cause you to miss connections between parameters that do need 
> fixing (as the connection is discovered later).
> 
> Connection between parameters needs to be added *after* the implications 
> algorithm has finished. And once it's there (might be already there if I 
> missed something), then this part of code probably becomes unnecessary.
Yes.  They will be noticed.  Here is why:

The code here first forms a ring over `p` and  `q` in the assignment (directed) 
graph:  
```
p <--> q
```
Then the two initialization statements (implemented in [[ 
https://reviews.llvm.org/D150489 | D150489 ]]) add two more edges to the graph:
```
p2 --> p <--> q <-- q2
```
The algorithm later searches the graph starting from unsafe variables `p2` and 
`q2`, respectively,  Starting from `p2`, the algorithm discovers that `p2` and 
`p`, as well as `p` and `q` depend on each other.  Starting from `q2`,  the 
algorithm discovers that `q2` and `q`, as well as `p` and `q` depend on each 
other.   The dependency graph is sort of undirected. So eventually, the four 
variables `p2`, `p`, `q`, `q2` are in the same group.



================
Comment at: 
clang/test/SemaCXX/warn-unsafe-buffer-usage-fixits-multi-parm-span.cpp:13-15
+   expected-note{{change type of 'p' to 'std::span' to preserve bounds 
information, and change 'q' and 'a' to 'std::span' to propagate bounds 
information between them}} \
+   expected-note{{change type of 'q' to 'std::span' to preserve bounds 
information, and change 'p' and 'a' to 'std::span' to propagate bounds 
information between them}} \
+   expected-note{{change type of 'a' to 'std::span' to preserve bounds 
information, and change 'p' and 'q' to 'std::span' to propagate bounds 
information between them}}
----------------
NoQ wrote:
> QoL thing: //"to propagate bounds information between them"// isn't the 
> correct reason in this case. Bounds information doesn't propagate between `p` 
> and `q` and `a`. Each of them carries its own, completely unrelated bounds 
> information.
> 
> We need to come up with a different reason. Or it might be better to combine 
> the warnings into one warning here, so that we didn't need a reason:
> ```
> warning: 'p', 'q' and 'a' are unsafe pointers used for buffer access
> note: change type of 'p', 'q' and 'a' to 'std::span' to preserve bounds 
> information
> ```
> 
> This gets a bit worse when parameters are implicated indirectly, but it still 
> mostly works fine, for example:
> ```
> void foo(int *p, int *q) {
>   int *p2 = p;
>   p2[5] = 7;
>   int *q2 = q;
>   q2[5] = 7;
> }
> ```
> would produce:
> ```
> warning: 'p2' and 'q2' are unsafe pointers used for buffer access
> note: change type of 'p2' and 'q2' to 'std::span' to preserve bounds 
> information, and
>       change 'p' and 'q' to 'std::span' to propagate bounds information 
> between them
> ```
> 
> ---
> 
> In any case, this shows that the relationship between parameters (grouped 
> together simply for being parameters) is very different from the relationship 
> within groups of variables implicated by assignments (even if two or more of 
> them are parameters). All the way down to the warning/note printer, we 
> probably need to continue giving the parameter relationship a special 
> treatment.
This is a good argument!  An edge from `p` to `q` in an assignment graph stands 
for not only that fixing `p` implicates fixing `q` but also that the bounds 
information is propagated from `q` to `p`.   In this sense, the way I connect 
parameters is inappropriate.

But my concern is that although treating parameters specially could improve the 
quality of the diagnostic messages, the code may be more complex and less 
re-used.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D153059/new/

https://reviews.llvm.org/D153059

_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D153059: [-Wunsafe-buffer-usage] Group parameter fix-its

Reply via email to