donat.nagy marked 4 inline comments as done.
donat.nagy added a comment.

Thanks for the suggestions, I answered those where I had something to say and 
I'll implement the rest of them.

In D156312#4576120 <https://reviews.llvm.org/D156312#4576120>, @steakhal wrote:

> Have you considered checking the shift operands for taint?
> I wonder if we could warn if we cannot prove the correct use of the shift 
> operator when either of the operands is tainted.

It would be straightforward to add taint handling; but I'd prefer to leave it 
for a follow-up commit. Some experimentation would be needed to see whether it 
produces useful results (do real-world projects use shifts on tainted numbers? 
do we encounter false positives?).



================
Comment at: clang/lib/StaticAnalyzer/Checkers/BitwiseShiftChecker.cpp:53
+class BitwiseShiftValidator {
+  // Primary mutable state:
+  CheckerContext &Ctx;
----------------
steakhal wrote:
> Where does this comment corresponds to?
The two data members (`Ctx` and `State`) that are directly below the comment.

I tried to split the list of data members into three groups (primary mutable 
state, secondary mutable state, `const`  members), but now I reconsider this I 
feel that these header comments are actually superfluous.

I'll return to a variant of the earlier layout where I put 
`NonNegFlags`/`UpperBound` to the end and comment that they are only used for 
note tag creation; but don't emphasize the (immediately visible) `const`ness / 
non-`const`-ness of other members with comments.


================
Comment at: clang/lib/StaticAnalyzer/Checkers/BitwiseShiftChecker.cpp:57-59
+  // Secondary mutable state, used only for note tag creation:
+  enum { NonNegLeft = 1, NonNegRight = 2 };
+  unsigned NonNegFlags = 0;
----------------
steakhal wrote:
> How about using the same enum type for these two? It would signify that these 
> are used together.
> Also, by only looking at the possible values, it's not apparent that these 
> are bitflag values. A suffix might help the reader.
Using the enum type is a good idea, however I'd probably put the `Flag` suffix 
into the name of the type:
```
enum SignednessFlags : unsigned { NonNegLeft = 1, NonNegRight = 2 };
SignednessFlags NonNegOperands = 0;
```


================
Comment at: clang/lib/StaticAnalyzer/Checkers/BitwiseShiftChecker.cpp:60-63
+  // If there is an upper bound, it is <= the size of a primitive type
+  // (expressed in bits), so this is a very safe sentinel value:
+  enum { NoUpperBound = 999 };
+  unsigned UpperBound = NoUpperBound;
----------------
steakhal wrote:
> Have you considered using `std::optional`?
> Given that the dimension of this variable is "bits", I think it would be 
> great to have that in its name.
I considered using `std::optional`, but using this "old-school" solution 
ensures that `NoUpperBound` is comparable to and larger than the "real" upper 
bound values, which lets me avoid some ugly boolean logic. I'll update the 
comment and probably also the variable name.


================
Comment at: clang/lib/StaticAnalyzer/Checkers/BitwiseShiftChecker.cpp:106-109
+  if (auto BRP = checkOvershift()) {
+    Ctx.emitReport(std::move(BRP));
+    return;
+  }
----------------
steakhal wrote:
> I'd prefer `BR` or `Report`. Since we know already that we are in the 
> path-sensitive domain, `P` has less value there.
> I'd apply this concept everywhere in the patch.
The `P` stands for the "Ptr" in the name of the type alias `BugReportPtr`; but 
your comment clearly highlights that this variable name is confusing ;).

I'll probably avoid the `auto` and use `BugReportPtr Bug = ...` to remind the 
reader that this is a pointer (that can be null) and if it's non-null, then we 
found a bug.


================
Comment at: clang/lib/StaticAnalyzer/Checkers/BitwiseShiftChecker.cpp:190-191
+  std::string Msg =
+      llvm::formatv("The result of {0} shift is undefined because the right "
+                    "operand{1} is not smaller than {2}, the capacity of 
'{3}'",
+                    isLeftShift() ? "left" : "right", RightOpStr, LHSBitWidth,
----------------
steakhal wrote:
> 
This is an important distinction, I use "not smaller than {2}" as a shorter 
synonym of "larger than or equal to {2}". I can choose some other wording if 
you feel that this is not clear enough.


================
Comment at: clang/lib/StaticAnalyzer/Checkers/BitwiseShiftChecker.cpp:263
+
+  const SVal Right = Ctx.getSVal(Op->getRHS());
+
----------------
steakhal wrote:
> I'd move this closer to the first "use" place.
Good point, previously it was used by the calculation that is now performed by 
`assumeRequirement`.


================
Comment at: clang/lib/StaticAnalyzer/Checkers/BitwiseShiftChecker.cpp:315
+    default:
+      llvm_unreachable("this checker does not use other comparison operators");
+  }
----------------
steakhal wrote:
> 
I'd say that the current wording is correct because the the checker "use"s the 
comparison operators (in its implementation) instead of "accept"ing them from 
outside; but I can replace the message with e.g. "this //function// does not 
accept other comparison operators" if you'd prefer that.


================
Comment at: clang/lib/StaticAnalyzer/Checkers/BitwiseShiftChecker.cpp:301-302
+                      pluralSuffix(MaximalAllowedShift));
+    R->addNote(LeftNote, PathDiagnosticLocation{LHS, Ctx.getSourceManager(),
+                                                Ctx.getLocationContext()});
+    Ctx.emitReport(std::move(R));
----------------
NoQ wrote:
> donat.nagy wrote:
> > donat.nagy wrote:
> > > donat.nagy wrote:
> > > > gamesh411 wrote:
> > > > > donat.nagy wrote:
> > > > > > NoQ wrote:
> > > > > > > Can we just append this to the warning? The `addNote()` is useful 
> > > > > > > for notes that need to be placed in code outside of the execution 
> > > > > > > path, but if it's always next to the warning, it probably doesn't 
> > > > > > > make sense to display it separately.
> > > > > > I didn't append this to the warning because I felt that the warning 
> > > > > > text is already too long. (By the way, an earlier internal variant 
> > > > > > of this checker produced two separate notes next to the warning 
> > > > > > message.)
> > > > > > 
> > > > > > I tweaked the messages of this checker before initiating this 
> > > > > > review, but I feel that there's still some room for improvements. 
> > > > > > (E.g. in this particular case perhaps we could omit some of the 
> > > > > > details that are not immediately useful and could be manually 
> > > > > > calculated from other parts of the message...) 
> > > > > Just a thought on simplifying the diagnostics a bit:
> > > > > 
> > > > > Warning: "Right operand is negative in left shift"
> > > > > Note: "The result of left shift is undefined because the right 
> > > > > operand is negative"
> > > > > Shortened: "Undefined left shift due to negative right operand"
> > > > > 
> > > > > Warning: "Left shift by '34' overflows the capacity of 'int'"
> > > > > Note: "The result of left shift is undefined because the right 
> > > > > operand '34' is not smaller than 32, the capacity of 'int'"
> > > > > Shortened: "Undefined left shift: '34' exceeds 'int' capacity (32 
> > > > > bits)"
> > > > > 
> > > > > The more complex notes are a bit sketchy, as relevant information can 
> > > > > be lost, and the following solution is probably a bit too much, but 
> > > > > could prove to be an inspiration:
> > > > > 
> > > > > Warning: "Left shift of '1024' overflows the capacity of 'int'"
> > > > > CXX Note: "Left shift of '1024' is undefined because 'int' can hold 
> > > > > only 32 bits (including the sign bit), so some bits overflow"
> > > > > CXX Note: "The value '1024' is represented by 11 bits, allowing at 
> > > > > most 21 bits for bitshift"
> > > > > C Note: "Left shift of '1024' is undefined because 'int' can hold 
> > > > > only 31 bits (excluding the sign bit), so some bits overflow"
> > > > > C Note: "The value '1024' is represented by 11 bits, allowing at most 
> > > > > 20 bits for bitshift"
> > > > > 
> > > > > Shortened:
> > > > > CXX Warning: "Undefined left shift: '1024' (11 bits) exceeds 'int' 
> > > > > capacity (32 bits, including sign)"
> > > > > C Warning: "Undefined left shift: '1024' (11 bits) exceeds 'int' 
> > > > > capacity (31 bits, excluding sign)"
> > > > Clarification about the `Msg`/`ShortMsg` distinction:
> > > > I'm intentionally separating the short warning messages and the longer 
> > > > note messages because `createBugReport()` enforces the convention that 
> > > > it will always emit a warning and a note at the bug location.
> > > > 
> > > > According to the comments in the source code, the intention is that the 
> > > > note contains all the relevant information, while the warning is a 
> > > > brief summary that can be displayed in situations where the notes 
> > > > wouldn't fit the UI.
> > > > 
> > > > IIUC many checkers ignore this distinction and emit the same (often 
> > > > long and cumbersome) message both as a note and as a warning 
> > > > (`createBugReport()` has a variant which takes only one message 
> > > > parameter and passes it to both locations), but I tried to follow it 
> > > > because I think it's ugly when the same message is repeated twice and 
> > > > there is some sense in providing both a brief summary and a full 
> > > > description that doesn't use potentially-ambiguous abbreviations to 
> > > > save space.
> > > > 
> > > > Of course I could also accept a community decision that this "brief 
> > > > warning / full note" distinction is deprecated and will be eliminated 
> > > > (because e.g. front-end utilities are not prepared to handle it), but 
> > > > in that case I'd strongly suggest a redesign where we eliminate the 
> > > > redundantly printed 'note' message. (There is no reason to say the same 
> > > > thing twice! There is no reason to say the same thing twice!)
> > > > 
> > > > On the other hand, in addition to this `Msg`/`ShortMsg` distinction, 
> > > > this part of the code also adds the extra `LeftNote` (as a remnant from 
> > > > an earlier internal version of this checker), and that's that's what 
> > > > I'd like to merge into `Msg` (following NoQ's suggestion and keeping it 
> > > > distinct from the `ShortMsg`).
> > > Among notes, my only planned change is that instead of 
> > > 
> > > > Warning: "Left shift of '1024' overflows the capacity of 'int'"
> > > > CXX Note: "Left shift of '1024' is undefined because 'int' can hold 
> > > > only 32 bits (including the sign bit), so some bits overflow"
> > > > CXX Note: "The value '1024' is represented by 11 bits, allowing at most 
> > > > 21 bits for bitshift"
> > > > C Note: "Left shift of '1024' is undefined because 'int' can hold only 
> > > > 31 bits (excluding the sign bit), so some bits overflow"
> > > > C Note: "The value '1024' is represented by 11 bits, allowing at most 
> > > > 20 bits for bitshift"
> > > 
> > > I'll implement something like
> > > 
> > > > Warning: "Left shift of '1024' overflows the capacity of 'int'"
> > > > CXX Note: "Left shift of '1024' (represented by 11 bits) is undefined 
> > > > because 'int' can hold only 32 bits (including the sign bit), so some 
> > > > bits overflow"
> > > > C Note: "Left shift of '1024' (represented by 11 bits) is undefined 
> > > > because 'int' can hold only 31 bits (excluding the sign bit), so some 
> > > > bits overflow"
> > Correction: now I'm leaning towards just discarding the secondary note, 
> > because the message examples listed in the previous comment are just the 
> > variants where the right operand is unknown. In the more detailed message 
> > template "The shift '{0} << {1}' is undefined {2}, so {3} bit{4} 
> > overflow{5}" there is no place to insert the "represented by {} bits" 
> > message.
> > 
> There's nothing wrong with long, multi-sentence diagnostic messages!
> 
> Unlike the compiler proper, we aren't typically used from the command line, 
> so we aren't trying to fit into 80 columns. So we start our warnings with a 
> capital letter and expect them to be, at the very least, complete sentences.
Even if the message length is not limited, I think it's better to delete the 
additional note line (instead of merging it with the "regular" note line) 
because it's just //too much information// and the gist of the issue is lost 
behind the flood of redundant numbers.

However, this is just a subjective decision and I'm open to extending the 
message if you feel that something important is missing from it.


================
Comment at: clang/test/Analysis/bitwise-shift-common.c:108
+  *p += 1 << a;
+  // expected-note@-1 {{Assuming right operand of bit shift is non-negative 
but less than 32}}
+  *p += 1 << (a + 32);
----------------
steakhal wrote:
> Hmm, does this "but" actually mean "and" here?
> Anyway, both using "but" and "and" there looks awkward.
> It might be only me though.
Yes, this sentence fragment is like "not too small but not too large", where 
"but" is equivalent to "and" and both are a bit awkward.


================
Comment at: clang/test/Analysis/bitwise-shift-sanity-checks.c:20-22
+// RUN:    -Wno-shift-count-negative -Wno-shift-negative-value \
+// RUN:    -Wno-shift-count-overflow -Wno-shift-overflow \
+// RUN:    -Wno-shift-sign-overflow
----------------
steakhal wrote:
> How about simplifying these lines to `-Wno-shift`? Have you considered it?
Unfortunately there is no such shortcut: when I tried to execute trunk clang 
with it, I got "//warning: unknown warning option '-Wno-shift'//".



Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D156312/new/

https://reviews.llvm.org/D156312

_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

Reply via email to