[clang] [clang] [Static analyzer]: add initial support for builtin overflow (PR #102602)

Donát Nagy via cfe-commits Fri, 09 Aug 2024 06:39:13 -0700

================
@@ -50,6 +101,44 @@ class BuiltinFunctionChecker : public Checker<eval::Call> {
 
 } // namespace
 
+void BuiltinFunctionChecker::HandleOverflowBuiltin(const CallEvent &Call,
+                                                   CheckerContext &C,
+                                                   BinaryOperator::Opcode Op,
+                                                   QualType ResultType) const {
+  // All __builtin_*_overflow functions take 3 argumets.
+  assert(Call.getNumArgs() == 3);
+
+  ProgramStateRef State = C.getState();
+  SValBuilder &SVB = C.getSValBuilder();
+  const Expr *CE = Call.getOriginExpr();
+
+  SVal Arg1 = Call.getArgSVal(0);
+  SVal Arg2 = Call.getArgSVal(1);
+
+  SVal RetVal = SVB.evalBinOp(State, Op, Arg1, Arg2, ResultType);
+
+  // TODO: Handle overflows with values that known to overflow. Like INT_MAX + 
1
+  // should not produce state for non-overflow case and threat it as
----------------
NagyDonat wrote:


I thought about this for some time and checked the implementation of 
`evalBinOp` (more precisely `SValBuilder:evalBinOpNN`, the case that's relevant 
here), but unfortunately I don't see a clear solution.

You are right that `SVB` _should_ take care of this, but unfortunately that 
does not happen currently.

Currently the analyzer (and `SValBuilder` in particular) has its own rigid 
opinion about overflow handling: it allows overflow on all numeric operations 
(including signed ones, where it's undefined behavior according to the 
standard) and models it without splitting the state, so e.g. if `x` and `y` are 
`signed char` values within the range `50 <= x, y <= 100`, then it calculates 
that the possible values of `x+y` are $[-128, -56] \cup [100, 127]$.

It would be good to enhance this behavior by tagging the ranges within the 
`RangeSet` with "did overflow happen?" bits that would let us perform a state 
split. This is theoretically doable (I don't see anything that would block 
this), but would be a complex refactoring effort (we'd need to pass along this 
"did we overflow?" bit through all the spaghetti code).

Without this, you can implement a workaround that calls 
`SValBuilder::getMinValue()` and `SValBuilder::getMaxValue()` to get the 
minimal and maximal values of the operands (as `llvm::APSInt` values, that is, 
**a**rbitrary **p**recision **s**igned **int**egers) and then uses these to 
determine which of the two branches (overflow, no-overflow) are possible.

This workaround solution would be better than the current "always act as if 
both branches were possible", but it is not sufficient for deducing the set of 
possible results on the no-overflow branch. (That is, if `x` and `y` are 
`signed char` values within the range `50 <= x, y <= 100`, then this workaround 
would create two branches: one where `__builtin_add_overflow` returns true, and 
and one where it returns `false` and stores a symbol with range $[-128, -56] 
\cup [100, 127]$ within the result pointer; while the correct assumption would 
be that the result has range $[100, 127]$ on this no-overflow branch.)

https://github.com/llvm/llvm-project/pull/102602
_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[clang] [clang] [Static analyzer]: add initial support for builtin overflow (PR #102602)

Reply via email to