I have been kicking these sorts of ideas around ever since I came to understand 
that
the second "UNKNOWN" in the for loop that originally started this was due to 
the state
merge as we loop. For now, and I don't mean this disrespectfully because it is 
very
hard to get right, the whole issue of merging has basically been punted, given 
some
of the simple examples we found that will merge as an unknown svalue. As you 
think
about this issue, "scope creep" becomes a concern quickly. It  quickly turns 
into
a halting problem of sorts, you have to decide how much of you want the 
analyzer to
be able to "understand" a program. For example, any human can follow this:

sum = 0;
for (idx = 1; idx <= 10; idx++) sum += idx;
__analyzer_eval (sum == 55);

but from an analyzer perspective it opens up all sorts of questions and
becomes a bit of a PhD thesis as to where you draw the line. The biggest concern
with the analyzer seems to be vulnerabilities, so I doubt it is useful to get 
the
analyzer to produce the correct answer for the above code, although it might be
interesting to do so from an academic perspective.

The example you provided gives a concrete reason that overflow should not be a
complete "punt" and I like it. In the interest of fighting scope creep and 
keeping
things manageable, I would question whether you want to actually track the 
overflow /
no overflow cases separately or just raise any possible overflow as an error 
immediately.
I am not disputing your idea, I would prefer to track the overflow and get
a correct result (in this case, an under allocation of memory). I guess I would 
want
to know how much work you think that will be. You still know the codebase a lot 
better
than I do.

My devil's advocate position would be if the analyzer raises exception on
any possible overflow, will that overwhelm the user with false positives? I
am not sure of the answer here, because a piece of me feels that overflow is not
something that production code should be relying on in any serious application,
and so should be non existent, but I am not sure if that is reflective of
reality.

The simplest way to handle your example would be the following:

struct foo * make_arr (size_t n)
{
  if (MAX_INT / sizeof (struct foo) >= n)
    return NULL;
  //continue with what you wrote
}

This should add a constraint that downstream of the initial guard, n is small
enough to prevent overflow (I would have to check, but the current analyzer
should be close to doing this if not already correct). Therefore, all we would 
need
would be the check at the definition of sz as to whether it overflows, and that
check should come back negative in my example, unknown in yours. Assuming we 
agree
that the purpose of the analyzer is to prevent vulnerabilities, and not to 
provide
an academic exercise in seeing how close we get to solving the halting problem,
we could just treat any possible overflow as an error.

As this was fundamentally your project, I don't want to tell you how to do it, 
and
the nerd in me wants to build an analyzer that can answer all the silly puzzle 
code
I can think to feed it, but from a utilitarian perspective and given everyone's
limited time, I thought I would offer this as a path you can try.

Reply via email to