ziqingluo-90 added inline comments.
================ Comment at: clang/lib/Analysis/UnsafeBufferUsage.cpp:2293 + // search of connected components. + if (!ParmsNeedFix.empty()) { + auto First = ParmsNeedFix.begin(), Last = First; ---------------- NoQ wrote: > ziqingluo-90 wrote: > > NoQ wrote: > > > ziqingluo-90 wrote: > > > > NoQ wrote: > > > > > What about the situation where params aren't seen as unsafe yet, but > > > > > they're discovered to be unsafe later? Eg. > > > > > ``` > > > > > void foo(int *p, int *q) { > > > > > int *p2 = p; > > > > > p2[5] = 7; > > > > > int *q2 = q; > > > > > q2[5] = 7; > > > > > } > > > > > ``` > > > > > Will we notice that `p` and `q` need to be fixed simultaneously? > > > > > > > > > > --- > > > > > > > > > > I suspect that the problem of parameter grouping can't be solved by > > > > > pre-populating strategy implications between all parameters. It'll > > > > > either cause you to implicate all parameters (even the ones that will > > > > > never need fixing), or cause you to miss connections between > > > > > parameters that do need fixing (as the connection is discovered > > > > > later). > > > > > > > > > > Connection between parameters needs to be added *after* the > > > > > implications algorithm has finished. And once it's there (might be > > > > > already there if I missed something), then this part of code probably > > > > > becomes unnecessary. > > > > Yes. They will be noticed. Here is why: > > > > > > > > The code here first forms a ring over `p` and `q` in the assignment > > > > (directed) graph: > > > > ``` > > > > p <--> q > > > > ``` > > > > Then the two initialization statements (implemented in [[ > > > > https://reviews.llvm.org/D150489 | D150489 ]]) add two more edges to > > > > the graph: > > > > ``` > > > > p2 --> p <--> q <-- q2 > > > > ``` > > > > The algorithm later searches the graph starting from unsafe variables > > > > `p2` and `q2`, respectively, Starting from `p2`, the algorithm > > > > discovers that `p2` and `p`, as well as `p` and `q` depend on each > > > > other. Starting from `q2`, the algorithm discovers that `q2` and `q`, > > > > as well as `p` and `q` depend on each other. The dependency graph is > > > > sort of undirected. So eventually, the four variables `p2`, `p`, `q`, > > > > `q2` are in the same group. > > > > > > > > The code here first forms a ring over `p` and `q` in the assignment > > > > (directed) graph > > > > > > I don't think it does. The ring is formed over the `ParamsNeedFix` list, > > > which is empty because none of these parameters are listed in > > > `UnsafeOps.byVar`. > > Actually `p` and `q` will be in `ParamsNeedFix`. Because they are > > implicated by `p2` and `q2`, who are unsafe variables. `ParamsNeedFix` > > include parameters that need fix, either because they are used in unsafe > > operations or they are implicated by some Gadgets. > > > > But anyway, I now think it's not a good idea to lump parameters with > > variables grouped by implication, as you pointed out that variable > > implication also implies bounds propagation. > > > > Instead, I think it might be more appropriate to make a variable group > > structured: > > - Let's call it a //connected components// for a set of variables that > > need to be fixed together and sharing bounds information. (The term > > //connected components// has been used in the code to refer to a group of > > variables. It keeps its meaning here.) > > - Let's redefine a Variable Group to be a set of mutually disjoint > > //connected components//s (CCs). If there are more than one CCs in a > > variable group, each of the CCs contains at least one parameter. > > > > For generating fix-its for a variable group, this patch still works as it > > only requires that variables in a variable group are either all fixed or > > given up as a whole. > > For generating diagnostic messages, we now have more information about > > relations among grouped variables. > > Actually `p` and `q` will be in ParamsNeedFix. Because they are implicated > > by `p2` and `q2`, who are unsafe variables. `ParamsNeedFix` include > > parameters that need fix, either because they are used in unsafe operations > > or they are implicated by some Gadgets. > > Ohh, right, it's populated twice, I even commented on the other part but > didn't notice it populates the same list! > > But in this case you'll probably join `p` and `q` in > ``` > void foo(int *p, int *q) { > int *p2 = p; > p2[5] = 7; > int *q2 = q; > } > ``` > right? (In this case `q` is implicated by `q2` so it gets added to the list, > but `q2` itself isn't unsafe or implicated by anything, so it doesn't need > fixing, but connecting it to `p` would cause it to be fixed unnecessarily). > So I still think this can't be correct unless you do it after the crawler > finishes. > > > But anyway, I now think it's not a good idea to lump parameters with > > variables grouped by implication, as you pointed out that variable > > implication also implies bounds propagation. > > Yeah, I think it makes a lot more sense. Just look at variable groups already > built by the existing crawler, and if more than one of them contains > parameters, report all groups that contain parameters through one combined > warning. Though I'm going to remove this part, achieving some agreement on this algorithm may help me understand the problem better. > right? (In this case q is implicated by q2 so it gets added to the list, but > q2 itself isn't unsafe or implicated by anything, so it doesn't need fixing, > but connecting it to p would cause it to be fixed unnecessarily) Right. the code in this patch does not form `ParmsNeedFix` correctly. `ParmsNeedFix` should be the set of parameters that are either unsafe or being implicated (transitively) by unsafe variables. Then the group would be "correct" I think. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D153059/new/ https://reviews.llvm.org/D153059 _______________________________________________ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits