On 08/08/2018 11:36 PM, Jeff Law wrote:
On 08/02/2018 09:42 AM, Martin Sebor wrote:
The warning bits are definitely not okay by me. The purpose
of the warnings (-W{format,sprintf}-{overflow,truncation} is
to detect buffer overflows. When a warning doesn't have access
to string length information for dynamically created strings
(like the strlen pass does) it uses array sizes as a proxy.
This is useful both to detect possible buffer overflows and
to prevent false positives for overflows that cannot happen
in correctly written programs.
So how much of this falling-back to array sizes as a proxy would become
unnecessary if sprintf had access to the strlen pass as an analysis module?
As you know we've been kicking that around and from my investigations
that doesn't really look hard to do. Encapsulate the data structures in
a class, break up the statement handling into analysis and optimization
and we should be good to go. I'm hoping to start prototyping this week.
If we think that has a reasonable chance to eliminate the array-size
fallback, then that seems like the most promising path forward.
We discussed this idea this morning so let me respond here and
reiterate the answer. Using the strlen data will help detect
buffer overflow where the array size isn't available but it
cannot replace the array size heuristic. Here's a simple
example:
struct S { char a[8]; };
char d[8];
void f (struct S *s, int i)
{
sprintf (d, "%s-%i", s[i].a, i);
}
We don't know the length of s->a but we do know that it can
be up to 7 bytes long (assuming it's nul-terminated of course)
so we know the sprintf call can overflow. Conversely, if
the size of the destination is increased to 20 the sprintf
call cannot overflow so the diagnostic can be avoided.
Removing the array size heuristic would force us to either give
up on diagnosing the first case or issue false positives for
the second case. I think the second alternative would make
the warning too noisy to be useful.
The strlen pass will help detect buffer overflows in cases
where the array size isn't known (e.g., with dynamically
allocated buffers) but where the length of the string store
in the array is known. It will also help avoid false positives
in cases where the string stored in an array of known size is
known to be too short to cause an overflow. For instance here:
struct S { char a[8]; };
char d[8];
void f (struct S *s, int i)
{
if (strlen (s->a) < 4 && i >= 0 && i < 100)
sprintf (d, "%s-%i", s->a, i);
}
Martin