NoQ created this revision.
NoQ added reviewers: dcoughlin, xazax.hun, a_sidorin, rnkovacs, Szelethus, 
baloghadamsoftware, Charusso.
Herald added subscribers: cfe-commits, dkrupp, donat.nagy, mikhail.ramalho, 
a.sidorin, szepet.
Herald added a project: clang.

This is the bug that I'll be using in the LLVM Developer's Meeting presentation 
as an example of how to debug things, so //spoiler alert!//

And let me explain it a bit slower than usual.

Here's the original false positive that we're fixing:

F10304929: bad.html <https://reviews.llvm.org/F10304929>

It has appeared by applying the Static Analyzer to LLVM itself. Namely, you 
should be able to reproduce it locally by applying the static analyzer to LLVM 
rL373385 <https://reviews.llvm.org/rL373385>; the static analyzer itself should 
be one commit behind //this// commit.

The warning tells us that variable `Reg` on line 824 is uninitialized:

F10304965: Screen Shot 2019-10-17 at 6.18.06 PM.png 
<https://reviews.llvm.org/F10304965>

This, however, is clearly a false positive, because the lambda invocation 
`isKilledReg(MO, Reg)` receives a non-const reference to `Reg` and initializes 
it on the current path:

F10304987: Screen Shot 2019-10-17 at 6.19.32 PM.png 
<https://reviews.llvm.org/F10304987>

The static analyzer somehow fails to realize that `Reg` is initialized, and 
even displays a clearly incorrect note (41) "Returning without writing to 
'Reg'".

The variable `Reg` itself is declared in the caller function, 
`transferSpillOrRestoreInst`, and passed into the current function by 
reference, which is also named `Reg`:

F10305034: Screen Shot 2019-10-17 at 6.22.45 PM.png 
<https://reviews.llvm.org/F10305034>

F10305046: Screen Shot 2019-10-17 at 6.24.04 PM.png 
<https://reviews.llvm.org/F10305046>

In order to debug the issue, I dumped the trimmed exploded graph for the 
current analysis and used exploded-graph-rewriter to remove the unnecessary 
details:

  $ clang ... -analyze-function="(anonymous 
namespace)::LiveDebugValue::runOnMachineFunction(class llvm::MachineFunction 
&)" -analyzer-dump-egraph=graph.dot -trim-egraph
  
  $ exploded-graph-rewriter.py --topology graph.dot
  
  # I searched the topology dump and found out that our bug node is 52419.
  $ exploded-graph-rewriter.py --to 52419 --single-path --diff graph.dot

Here's the final graph dump:

F10305539: tmpruvE5U.html <https://reviews.llvm.org/F10305539>

I tried to find out which transition in the graph was incorrect.

First of all I confirmed that there is no binding in the Store for the `Reg` in 
the bug state. The //referece// `Reg` was there and was known to refer to the 
//variable// `Reg`, but the variable itself is indeed nowhere to be found, as 
if it was never written to:

F10305142: Screen Shot 2019-10-17 at 6.35.33 PM.png 
<https://reviews.llvm.org/F10305142>

This gave me confidence that the checker is not to be blamed: the variable was 
"known" to be uninitialized, therefore the checker was right to report it, and 
we need to find out why was the variable incorrectly known to be uninitialized.

Therefore, logically, the next question was why did the assignment on line 812, 
before step 41, was not recorded in the Store. Here's the ExplodedNode in which 
the assignment operator was evaluated:

F10305210: Screen Shot 2019-10-17 at 6.42.28 PM.png 
<https://reviews.llvm.org/F10305210>

Here we can see that the assignment was in fact recorded in the Store, but in a 
wrong memory region. In our notation, `SymRegion{reg_$3519<unsigned int & 
Reg>}` is an //unknown// region of memory that is the pointee of the reference 
`Reg` points to.

However, given that `Reg` is a parameter in a function call into which we dived 
during analysis, this symbolic value should not have existed to begin with, 
because we have concrete knowledge about what `Reg` points to. Instead, we 
should have had a binding in the Store from `Reg` (the memory region of the 
reference parameter) to `&Reg` (the pointer to the variable `Reg`), and loading 
from `Reg` (as part of evaluating which location should be written into) should 
have yielded `&Reg` rather than `&SymRegion{reg_$3519<unsigned int & Reg>}`.

In order to confirm this educated guess, I compared the evaluation that the 
Static Analyzer did for call sites of `isLocationSpill()`:

F10305268: Screen Shot 2019-10-17 at 6.56.29 PM.png 
<https://reviews.llvm.org/F10305268>

and `isKilledReg()`:

F10305274: Screen Shot 2019-10-17 at 6.57.33 PM.png 
<https://reviews.llvm.org/F10305274>

And, indeed, the binding `Reg : &Reg` is supposed to be there in both cases, 
but it is missing in case of `isKilledReg()`. Interestingly, the binding for 
the other parameter, `MO`, is also missing in case of `isKilledReg`.

By scrolling up a few nodes we can see that the bindings for //expressions// 
`Reg` and `MO` are present in the Environment when the call is being entered:

F10305300: Screen Shot 2019-10-17 at 7.01.05 PM.png 
<https://reviews.llvm.org/F10305300>

In other words, the translation of parameter values from the Environment to the 
Store was not performed correctly during call enter, even though the necessary 
information seems to have been present in the program state. Therefore this 
transition was the primary suspect.

Then I attached debugger to the transition. Even though there are many call 
enters that were evaluated throughout this analysis, I can attach the debugger 
precisely with the help of the stable IDs:

  (lldb) br s -n inlineCall -c 'Pred->Id = 52357'

Here `inlineCall` is a method of `ExprEngine` that is responsible for "diving" 
into calls during analysis. Because the `ExprEngine` class is responsible for 
evaluating the effects of specific statements (and other CFG elements) over the 
program state, any transition can be debugged by setting a similar breakpoint 
on a method of this class that looks relevant.

Step-by-step debugging led me into function `addParameterValuesToBindings()` in 
`CallEvent.cpp`, which is indeed the function responsible for the translation 
of parameter values from the Environment to the Store:

  528     SVal ArgVal = Call.getArgSVal(Idx);
  529     if (!ArgVal.isUnknown()) {
  530       Loc ParamLoc = SVB.makeLoc(MRMgr.getVarRegion(ParamDecl, 
CalleeCtx));
  531       Bindings.push_back(std::make_pair(ParamLoc, ArgVal));
  532     }

This transition seems straightforward, however there is a caveat - in C++ the 
transition is skipped for parameters that were constructed by invoking a 
constructor:

  522       if (Call.isArgumentConstructedDirectly(Idx))
  523         continue;

Indeed, if a constructor was invoked, then the Store bindings should already be 
there by the time the call is entered, so there is no need to manually bind 
them.

We do in fact have an argument that is constructed directly in this example, 
which is `MO` - a C++ object of class `MachineOperand` that gets passed into 
`isKilledReg` by value:

F10305470: Screen Shot 2019-10-17 at 7.24.42 PM.png 
<https://reviews.llvm.org/F10305470>

By the way, you can learn more about how constructors are modeled in a talk by 
@george.karpenkov and me in LLVM DevMtg 2018: 
https://www.youtube.com/watch?v=4n3l-ZcDJNY

The problem, however, was that `isArgumentConstructedDirectly(Idx)` returned 
`false` for `MO` but `true` for `Reg`! It should clearly be the other way round.

This was happening due to an off-by-one error caused by how in Clang AST for 
historical numbering of parameters in member operator declarations is different 
by 1 from the numbering of arguments in member operator call expressions: one 
of them counts "this", the other doesn't. I was already aware about this 
problem, which is why in D49443 <https://reviews.llvm.org/D49443> I added 
convenient accessor methods for dealing with these mismatched offsets. However, 
I forgot to use them in this tiny code snippet. Hence the patch.

Thanks for reading!


Repository:
  rC Clang

https://reviews.llvm.org/D69155

Files:
  clang/lib/StaticAnalyzer/Core/CallEvent.cpp
  clang/test/Analysis/temporaries.cpp


Index: clang/test/Analysis/temporaries.cpp
===================================================================
--- clang/test/Analysis/temporaries.cpp
+++ clang/test/Analysis/temporaries.cpp
@@ -1231,3 +1231,19 @@
   return coin ? S() : foo(); // no-warning
 }
 } // namespace return_from_top_frame
+
+#if __cplusplus >= 201103L
+namespace arguments_of_operators {
+struct S {
+  S() {}
+  S(const S &) {}
+};
+
+void test() {
+  int x = 0;
+  auto foo = [](S s, int &y) { y = 1; };
+  foo(S(), x);
+  clang_analyzer_eval(x == 1); // expected-warning{{TRUE}}
+}
+} // namespace arguments_of_operators
+#endif // __cplusplus >= 201103L
Index: clang/lib/StaticAnalyzer/Core/CallEvent.cpp
===================================================================
--- clang/lib/StaticAnalyzer/Core/CallEvent.cpp
+++ clang/lib/StaticAnalyzer/Core/CallEvent.cpp
@@ -519,7 +519,7 @@
 
     // TODO: Support allocator calls.
     if (Call.getKind() != CE_CXXAllocator)
-      if (Call.isArgumentConstructedDirectly(Idx))
+      if (Call.isArgumentConstructedDirectly(Call.getASTArgumentIndex(Idx)))
         continue;
 
     // TODO: Allocators should receive the correct size and possibly alignment,


Index: clang/test/Analysis/temporaries.cpp
===================================================================
--- clang/test/Analysis/temporaries.cpp
+++ clang/test/Analysis/temporaries.cpp
@@ -1231,3 +1231,19 @@
   return coin ? S() : foo(); // no-warning
 }
 } // namespace return_from_top_frame
+
+#if __cplusplus >= 201103L
+namespace arguments_of_operators {
+struct S {
+  S() {}
+  S(const S &) {}
+};
+
+void test() {
+  int x = 0;
+  auto foo = [](S s, int &y) { y = 1; };
+  foo(S(), x);
+  clang_analyzer_eval(x == 1); // expected-warning{{TRUE}}
+}
+} // namespace arguments_of_operators
+#endif // __cplusplus >= 201103L
Index: clang/lib/StaticAnalyzer/Core/CallEvent.cpp
===================================================================
--- clang/lib/StaticAnalyzer/Core/CallEvent.cpp
+++ clang/lib/StaticAnalyzer/Core/CallEvent.cpp
@@ -519,7 +519,7 @@
 
     // TODO: Support allocator calls.
     if (Call.getKind() != CE_CXXAllocator)
-      if (Call.isArgumentConstructedDirectly(Idx))
+      if (Call.isArgumentConstructedDirectly(Call.getASTArgumentIndex(Idx)))
         continue;
 
     // TODO: Allocators should receive the correct size and possibly alignment,
_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

Reply via email to