[PATCH] D72705: [analyzer] Added new checker 'alpha.unix.ErrorReturn'.

Kristóf Umann via Phabricator via cfe-commits Wed, 26 Aug 2020 04:11:11 -0700

Szelethus requested changes to this revision.
Szelethus added a subscriber: bruntib.
Szelethus added a comment.
This revision now requires changes to proceed.

I debated this 
<https://codechecker-demo.eastus.cloudapp.azure.com/Default/report-detail?run=emacs_errorreturn&review-status=Unreviewed&review-status=Confirmed%20bug&detection-status=New&detection-status=Reopened&detection-status=Unresolved&is-unique=on&diff-type=New&checker-name=alpha.unix.ErrorReturn&report-hash=24c61769a1ba68dddd98b4611c1132bd&report-id=7060>
 report with @bruntib, depending from where you look at it, its either a false 
positive or not -- in short, this is what happens:

  char *p = /* some assumed to be valid c string */
  end = strchr(p, '\n');
  if (end - p > 4) { // <-- warning here on end's invalid use
  }

the guarded code wouldn't be executed, should `end` be zero. Now, if we 
interpret the rule in a very literal sense, that

> The successful completion or failure of each of the standard library 
> functions listed in the following table shall be determined either by 
> comparing the function’s return value with the value listed in the column 
> labeled “Error Return” or by calling one of the library functions mentioned 
> in the footnotes.

then indeed, it doesn't compare the return value with the item in the table, 
but it does compare it with superset of it. Now, we can either say that

1. A superset isn't good enough, because its a sign of code smell, or
2. Judge that the user probably did handle the error case, albeit in an 
unconventional way, and report like these don't provide any value.

I don't think we can realistically pursue point (2.). We would need to widen 
the context (by going further up the AST with `Stmt::getParent()`) we're 
considering for a potential check. However, forming a non-checking expression 
that we don't immediately report can quickly get out of control:

  char *p = /* some assumed to be valid c string */;
  end = strchr(p, '\n');
  int result = end - p;
  if (result > 4) {
  }

You would be right to believe, without context from the code that is yet to be 
executed, that the computation of `result` is invalid, because `end` is 
unchecked. However, the code complies with the rule interpreted by (2.). This 
seems to be a problem that can have infinite complexity without very severe 
restriction on the analysis. Indeed, if we wonder too far from the original 
function call (by creating values that descend from the return value to be 
checked), being sure that no checks happened (by point 2.) quickly becomes 
unmanageable.

So, for all practical purposes, we're locked to option (1.). However, the 
warning message this way is imprecise -- we're **not** enforcing whether the 
error value was checked for error, we're enforcing whether the return value was 
checked against its respective failure value, and the message should reflect 
that. In fact, the checker description, and some comments should be clarified 
in this regard as well.

I would imagine the pool of functions that deserve such incredibly strict 
enforcement is rather small. Time management, maybe, but certainly not string 
modeling functions, because problems related to them are solved by a completely 
different checker infrastructure (`CStringChecker`), just as stream modeling 
functions and memory management. Maybe we should only look for statement-local 
bugs (basically a discarded return value) for functions that don't demand this 
rigidness, but I suspect there is a clang-tidy check already out there doing 
it. Seems like I got what @NoQ meant here.

In D72705#2088319 <https://reviews.llvm.org/D72705#2088319>, @NoQ wrote:

> I still don't understand how do we enable this checker, i.e. for whom and how 
> often. Is there anything we'll be able to turn on by default, like maybe warn 
> on all functions that wear `__attribute__((warn_unused_result))`?

---

I'm very happy we got another round of evaluations, we've gotten some 
invaluable information out of it. For future patches, it would be great to have 
your input on at least some of the reports as you're publishing them. For the 
latest project, I think there is clear intent from the developer to avoid null 
pointer misuses -- you should justify why a report is still appropriate there.

You mentioned in D72705#2187556 <https://reviews.llvm.org/D72705#2187556> that 
on `duckdb` you stumbled across 1-2 false positives, but also that you found 23 
reports, and that you ran the checker on `vim`. This may be okay for an alpha 
checker, or maybe not, if it points to a fundamental problem in your approach. 
Please publish them before updating the patch. If all goes well and those cases 
could be patched up later, here are the next important steps:

- Have a warning message that precisely states what, and how the rule was 
violated, even if its done with placeholders, even if you only wanna place a 
FIXME and address this later. Either is fine. By the time we ship this checker, 
a message that delivers the following information, even if phrased differently, 
would be great: "According to the CERT rule ERR33-C, the <return value/output 
parameter> must be explicitly checked whether its <equal/greater/less than> 
<error value>."
- Expand on the checker description similarly. Mention that checking against 
the superset of the error value isn't sufficient.

================
Comment at: clang/include/clang/StaticAnalyzer/Checkers/Checkers.td:513

 let ParentPackage = UnixAlpha in {

----------------
I'm fine with the checker being here, but the nature of the problem makes it 
definitely a good candidate to place it in an `optin` package. However, recent 
discussion about the changing of the package system, and that we're still in 
alpha, this isn't an issue to address immediate.

================
Comment at: clang/lib/StaticAnalyzer/Checkers/ErrorReturnChecker.cpp:189-190
+  bool EOFFound = false;
+  const llvm::APSInt *ValueToTestAgainst =
+      ErrorReturnChecker::getKnownConstantVal(C, OtherSideExpr);
+  if (ValueToTestAgainst) {
----------------
[[ 
https://codechecker-demo.eastus.cloudapp.azure.com/Default/report-detail?run=emacs_errorreturn&review-status=Unreviewed&review-status=Confirmed%20bug&detection-status=New&detection-status=Reopened&detection-status=Unresolved&is-unique=on&diff-type=New&checker-name=alpha.unix.ErrorReturn&report-hash=827f0d872db9555a61b7c8c4b6bb0477&report-id=7344
 | This ]] report made me imagine a case where the value we're checking against 
is the appropriate error value, but is only representable by an `SVal`. We 
could say that according to the rule the error value should be hardcoded, so 
this might just be okay, but I'd keep an eye on it (maybe with a `NOTE:`?).

================
Comment at: clang/lib/StaticAnalyzer/Checkers/ErrorReturnChecker.cpp:197-202
+    // If a function is called at the other side of comparison, assume that it
+    // returns a correct NULL or EOF value. For example: 'X == getEOFValue()'.
+    // It is not assumable that a NULL or EOF is obtained using any other
+    // expression.
+    NullFound = isa<CallExpr>(OtherSideExpr);
+    EOFFound = NullFound;
----------------
We talked about this during a meeting, but [[ 
https://codechecker-demo.eastus.cloudapp.azure.com/Default/report-detail?run=emacs_errorreturn&review-status=Unreviewed&review-status=Confirmed%20bug&detection-status=New&detection-status=Reopened&detection-status=Unresolved&is-unique=on&diff-type=New&checker-name=alpha.unix.ErrorReturn&report-hash=d1073c90fb1280a3a8b6076e67207f5d&report-id=7264
 | this ]] is interesting. Seems to be directly related to the highlighted code 
here -- we were not able to prove that the value we're testing against is the 
appropriate error value, so we guess (except if its a function call) that it 
isn't. Normally, I'd say that if we didn't manage to prove that the other value 
is incorrect, we should conservatively suppress the report, but according to 
point (1.) I just talked about, we could say that the rule enforces a hardcoded 
NULL.

At the end of the day, your evaluations on `strchr` are great for discussion 
purposes to see whether your infrastructure works, but reports on it might be 
too verbose and not valuable enough to ship into the real world.

================
Comment at: clang/lib/StaticAnalyzer/Checkers/ErrorReturnChecker.cpp:398
+          *BT_UncheckedUse, BT_UncheckedUse->getDescription(), N);
+      Report->addRange(CallLocation);
+      // FIXME: Provide information about source of CallSym.
----------------
I don't think this adds any value, and in fact makes reading the bug report 
harder, because the range is outside of the actual error location. This should 
be replaced (and not accompanied) with a `NoteTag`.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D72705/new/

https://reviews.llvm.org/D72705

_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D72705: [analyzer] Added new checker 'alpha.unix.ErrorReturn'.

Reply via email to