NoQ added inline comments.

================
Comment at: clang/include/clang/StaticAnalyzer/Core/PathSensitive/RangedConstraintManager.h:103-110
+  enum class CompareResult {
+    unknown,
+    identical,
+    less,
+    less_equal,
+    greater,
+    greater_equal
----------------
baloghadamsoftware wrote:
> ASDenysPetrov wrote:
> > Can we use `Optional<BinaryOperatorKind>` instead, to reduce similar enums? 
> > Or do you want to separate the meanings in such a way?
> Good question. @NoQ?
The meaning is completely different even when the names happen to match: 
`BO_LT` is not a "result", it is the operation itself. It gets even worse for 
other opcodes (what kind of comparison result would `BO_PtrMemI` be?).

The idea of reusing an existing enum is noble, but we would need to find some 
enum other than `BinaryOperatorKind`.
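
To illustrate the mismatch, here is a rough sketch; `compareRanges` and 
`compareRangesAsOpcode` are hypothetical helpers invented for this example, 
not part of the patch:

  // A CompareResult names a relation *inferred* between two range sets;
  // `unknown` is a legitimate outcome of that computation.
  CompareResult Res = compareRanges(LHS, RHS);

  // BinaryOperatorKind, by contrast, names an *operation* the program
  // performs. If the helper returned Optional<BinaryOperatorKind> instead,
  // every consumer would face opcodes that can never describe a
  // comparison outcome:
  if (Optional<BinaryOperatorKind> OpRes = compareRangesAsOpcode(LHS, RHS)) {
    switch (*OpRes) {
    case BO_LT:
      break; // plausible as a "result"
    case BO_PtrMemI:
      llvm_unreachable("Not a comparison result!"); // yet must be covered
    default:
      break;
    }
  }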


================
Comment at: clang/lib/StaticAnalyzer/Core/RangeConstraintManager.cpp:814
 
+Optional<ProgramStateRef> RangeConstraintManager::tryAssumeSymSymOp(
+    ProgramStateRef State, BinaryOperator::Opcode Op, SymbolRef LHSSym,
----------------
baloghadamsoftware wrote:
> NoQ wrote:
> > I believe you don't need to return an optional here. The method simply 
> > acknowledges any assumptions it could make in the existing state and 
> > returns the updated state. Therefore, if it wasn't able to record any 
> > assumptions, it returns the existing state. Because the only reasonable 
> > behavior the caller could implement when the current implementation returns 
> > `None` is to roll back to the existing state anyway.
> Actually, `tryAssumeSymSymOp()` does not assume anything and therefore it 
> does not return a new state. What it actually returns is a ternary logical 
> value: the assumption is either valid, invalid, or we cannot reason about it. 
> Maybe `Optional<bool>` would be better here, and we should also choose a less 
> misleading name, because it does not "try to assume" anything, but tries to 
> reason about the existing assumption.
I dislike this design because it turns the code into spaghetti in which every 
facility in the constraint manager has to be aware of every other facility in 
the constraint manager and agree on who's responsible for what. It's too easy 
to miss something, and missing something is worse than doing double work.

Like, should different facilities in the constraint manager try to record as 
*much* information in the program state as possible ("fat constraints") or as 
*little* as possible ("thin constraints")? E.g., in this example, if we know 
that the range for `$x` is strictly below the range for `$y`, should we also 
add the opaque constraint `$x < $y` to the program state, or should we instead 
teach every facility to infer `$x < $y` by looking at the two ranges?
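
Roughly, with hypothetical `setRange` and `setSymSymConstraint` helpers (named 
purely for illustration):

  // "Fat": record everything we know, even what is derivable.
  State = setRange(State, X, Range(1, 2));
  State = setRange(State, Y, Range(3, 4));
  State = setSymSymConstraint(State, X, BO_LT, Y); // the opaque $x < $y

  // "Thin": record only the ranges; every facility that later needs
  // $x < $y must re-derive it from the two ranges on its own.
  State = setRange(State, X, Range(1, 2));
  State = setRange(State, Y, Range(3, 4));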

I feel we should go with fat constraints.

1. Con: They're fat. They eat more memory, cause more immutable tree 
rebalances, etc.
2. Con: It's easy to accidentally make two equivalent states look different. 
E.g., `{ $x: [1, 2], $y: [3, 4] }` and `{ $x: [1, 2], $y: [3, 4], $x < $y: true }` 
describe the same state, but they won't be deduplicated, and if they appear on 
different paths at the same program point those paths won't get merged.
3. Pro: They minimize the amount of false positives by giving every facility in 
the constraint manager as much data as possible to conclude that the state is 
infeasible.
4. Pro: As I said above, they're easier to implement and to get right. In case 
of thin constraints, every facility has to actively let other facilities know 
that they don't need to handle the assumption anymore because this facility has 
"consumed" it. In our example it means returning an `Optional<ProgramStateRef>` 
which would be empty if the constraint wasn't consumed and needs to be handled 
by another facility (namely, `assumeSymUnsupported`). In case of fat 
constraints you simply add your information and don't care whether other 
facilities ever read it; see the sketch below.
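
A sketch of what this means for the calling convention; signatures are 
simplified, and `assumeSymSymOp` is an illustrative name rather than the 
patch's API:

  // Thin: each facility must signal whether it "consumed" the assumption,
  // and the caller must know which facility to fall back to.
  if (Optional<ProgramStateRef> NewState =
          tryAssumeSymSymOp(State, Op, LHSSym, RHSSym))
    return *NewState;                                  // consumed here
  return assumeSymUnsupported(State, Sym, Assumption); // handled elsewhere

  // Fat: simply record what you learned and return the state; if nothing
  // new was learned, it comes back unchanged and nobody has to coordinate.
  return assumeSymSymOp(State, Op, LHSSym, RHSSym);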

Con 1 should be dismissed as a premature optimization: we can always add some 
thinness to fat constraints once we have proof that their fatness is an actual 
problem. Con 2 is a real issue, but a fairly minor one: path merges are fairly 
rare anyway, and mergeability is a nice goal to pursue, but not at the cost of 
having false positives. So I think the pros outweigh the cons here.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D77792/new/

https://reviews.llvm.org/D77792
