NoQ added inline comments.

================
Comment at: lib/StaticAnalyzer/Checkers/ConversionChecker.cpp:84
@@ +83,3 @@
+// Can E value be greater or equal than Val?
+static bool canBeGreaterEqual(CheckerContext &C, const Expr *E,
+                              unsigned long long Val) {
----------------
danielmarjamaki wrote:
> danielmarjamaki wrote:
> > zaks.anna wrote:
> > > danielmarjamaki wrote:
> > > > zaks.anna wrote:
> > > > > This function returns true if the value "is" greater or equal, not
> > > > > "can be" greater or equal. The latter would be "return StGE".
> > > > >
> > > > > Also, it's slightly better to return the StGE state and use it to
> > > > > report the bug. This way, our assumption is explicitly recorded in
> > > > > the error state.
> > > > NoQ made the same comment. I disagree.
> > > >
> > > > int A = 0;
> > > > if (X) {
> > > >   A = 1000;
> > > > }
> > > > U8 = A; // <- Imho; A _can_ be 1000
> > > >
> > > > Imho it's better to say that A _can_ be 1000 unless A is 1000 for all
> > > > possible execution paths through the code.
> > > >
> > > > Do you still think "is" is better than "can be"?
> > > The Clang Static Analyzer performs path sensitive analysis of the
> > > program. (It does not merge the paths at the "U8 = A" statement!!!) You
> > > will only be changing the state along a single execution path of this
> > > program. Along that path, A will always be 1000.
> > >
> > > When analyzing your example, the analyzer is going to separately analyze
> > > 2 paths:
> > > 1st path: A=0; X != 0; A =1000; U8 = A; // Here U8 is definitely 1000.
> > > 2d path: A=0; X == 0; U8 = A; // Here U8 is definitely 0.
> > >
> > > This video contains an intuitive explanation of symbolic execution
> > > technique we use:
> > > http://llvm.org/devmtg/2012-11/videos/Zaks-Rose-Checker24Hours.mp4
> > I understand that and I still think that value of A "can be" 1000. Yes in
> > that path the value "is" 1000.
> >
> > But as far as I see, you and others disagree with me. And therefore I will
> > change to "is".
> For your information in Cppcheck I say that a value is "possible" if some
> path(s) generates that value. And "always" when all paths generate that value.
>
> Code example:
>
>     int f(int x) {
>         int a = 1000;
>         int b = 0;
>         if (x == 500)
>             a = 3;
>         return a + b - x;
>     }
>
> Debug output (cppcheck --debug-normal file.c):
>
>     ##Value flow
>     Line 3
>       1000 always 1000
>     Line 4
>       0 always 0
>     Line 5
>       x possible 500
>       == possible 1
>       500 always 500
>     Line 6
>       3 always 3
>     Line 7
>       a possible {1000,3}
>       + possible {1000,3}
>       b always 0
>       x possible 500
>
> For me personally it is confusing to say that A "is" 1000. That is different
> to how I normally think of it in Cppcheck.

Consider the following examples:

```
// Example 1.
int a = rand();
use(a);
```

```
// Example 2.
int b;
scanf("%d", &b);
use(b);
```

Here `a` and `b` "can be" 1000 when passed into `use()`, however your function would return `false`, because there is no particular execution path found on which they are "certainly" 1000.

On the other hand, in the following example:

```
// Example 3.
int a = rand();
if (a == 1000) {
}
use(a);
```

at the `use()` of `a` your function would return `true` on the path on which `a == 1000` holds and `false` on the other path. However, that `false` return value would not indicate that `a` "cannot be" 1000 on this line; it simply indicates that `a` is not certainly 1000 on one (but not all) of the paths.

These examples show that the function returns `false` much more often than its name suggests, hence we propose a different terminology. A good name for the function would express "the value 'certainly is' [greater or equal] on this particular path, though probably not on every path".

Additionally, example 2 is of particular interest: it gives an example of a "tainted" symbol, which is explicitly known to take all values that fit its integral type, depending on user (cf. "attacker") input. You might (some day, no rush, I guess) like to extend your checker to consider tainted values as truly "can-be-anything" and throw warnings even without finding a particular path. For such cases, your function would actually be something like "there exists a path on which it 'certainly can be', though maybe on other paths it's not as certain". There is already basic support for such taint analysis in the analyzer.

In fact, example 1 is also a curious discussion point. Even when there's no attacker to substitute your random number generator, symbols produced by `rand()` and the like are also explicitly known to take all values between `0` and some kind of `RAND_MAX`. That is information that may be useful to the analyzer; it can be described as a weaker kind of taint without much security implication, but it is still not something that can be represented as the presence or lack of range constraints.
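
To make the terminology concrete, here is a minimal sketch of the two readings over a toy per-path value range. Everything in it (`PathRange`, `isCertainlyGE`, `canBeGE`, `unconstrainedInt`, `exactly`) is a hypothetical name made up for illustration and is not analyzer API; the patch itself works with program states such as `StGE`, not plain intervals. Roughly, the function under review behaves like `isCertainlyGE`, while "return StGE" would behave like `canBeGE`.

```cpp
#include <cassert>
#include <climits>

// Hypothetical toy model (not the analyzer's API): the inclusive range of
// values one symbol may take on a single execution path.
struct PathRange {
  long long Lo;
  long long Hi;
};

// "Certainly is" on this path: every value in the range is >= Val.
inline bool isCertainlyGE(PathRange R, long long Val) { return R.Lo >= Val; }

// "Can be" on this path: at least one value in the range is >= Val.
inline bool canBeGE(PathRange R, long long Val) { return R.Hi >= Val; }

// Examples 1 and 2: an int about which nothing is known on the path.
inline PathRange unconstrainedInt() { return {INT_MIN, INT_MAX}; }

// Example 3, inside the `a == 1000` branch: the value is pinned to V.
inline PathRange exactly(long long V) { return {V, V}; }
```

In this model, `isCertainlyGE(unconstrainedInt(), 1000)` is false while `canBeGE(unconstrainedInt(), 1000)` is true, which is the situation in examples 1 and 2. For `exactly(1000)`, i.e. the `a == 1000` branch of example 3, both are true.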

http://reviews.llvm.org/D13126

_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits