Re: [PATCH] relax -Wformat-overflow for precision ranges (PR 79275)

Martin Sebor Wed, 01 Feb 2017 16:41:03 -0800

On 01/31/2017 03:33 PM, Jeff Law wrote:

On 01/30/2017 02:28 PM, Martin Sebor wrote:

Bug 79275 - -Wformat-overflow false positive exceeding INT_MAX in
glibc sysdeps/posix/tempname.c points out a false positive found
during a Glibc build and caused by the checker using the upper
bound of a range of precisions in string directives with string
arguments of non-constant length.  The attached patch relaxes
the checker to use the lower bound instead when appropriate.


Martin

gcc-79275.diff


PR middle-end/79275 -  -Wformat-overflow false positive exceeding
INT_MAX in glibc sysdeps/posix/tempname.c

gcc/testsuite/ChangeLog:

    PR middle-end/79275
    * gcc.dg/tree-ssa/builtin-sprintf-warn-11.c: New test.
    * gcc.dg/tree-ssa/pr79275.c: New test.

gcc/ChangeLog:

    PR middle-end/79275
    * gimple-ssa-sprintf.c (get_string_length): Set lower bound to zero.
    (format_string): Tighten up the range of output for non-constant
    strings and correct the expected range for wide non-constant strings.

My general inclination is to ask this to wait for gcc-8 as it is not a
regression, but instead a false positive in a new warning.


I would feel better if the patch were committed not just because
of the false positives but also because it corrects the range for
non-constant wide strings (i.e., for something like
snprintf (0, 0, "%ls", rand () ? L"\u03a6" : L"" ) trunk thinks
the output is at most 1 byte when in reality it's more like 2
bytes (U+03a6 is "\xCE\xA6" in UTF-8).  It's probably a pretty
rare case but still.


However, if we see this triggering with any significant frequency, then
we should reevaluate.  Getting Marek's build logs and grepping through
them might guide us a bit on this...


I'm not sure what the rationale is for length of zero at level 1 and
length of one at higher levels for unknown strings.  I guess I can
kindof see the former, though I suspect if we looked at the actual
strings, zero length would actually be uncommon.

For level 1 and above assuming a single character seems way too
tolerant.  We're issuing a "may" warning, so ISTM we ought to be
assuming a much larger length here.  I realize that makes a lot more
noise for the warning, but doesn't that better reflect what may happen?


My rationale for zero and one for strings of unknown length was
to try to avoid false positives.  The checker does use the size
of the array when a string of unknown length points to one as
the likely length.  That's run into the expected pushback but
at least there is (what I consider) a defensible rationale for
it (the string could potentially be as long as the array,
otherwise why make the array that big?)  It also uses the length
of the longest of the strings an argument is known to point to,
on the same basis.

My biggest concern with being more aggressive than that (besides
the pushback) is that I can't think of a good function to compute
the size (it can't very well be a constant).

Martin

diff --git a/gcc/gimple-ssa-sprintf.c b/gcc/gimple-ssa-sprintf.c
index 8261a44..c0c0a5f 100644
--- a/gcc/gimple-ssa-sprintf.c
+++ b/gcc/gimple-ssa-sprintf.c
@@ -1986,43 +1987,89 @@ format_string (const directive &dir, tree arg)
     }
   else
     {
-      /* For a '%s' and '%ls' directive with a non-constant string,
-     the minimum number of characters is the greater of WIDTH
-     and either 0 in mode 1 or the smaller of PRECISION and 1
-     in mode 2, and the maximum is PRECISION or -1 to disable
-     tracking.  */
+      /* For a '%s' and '%ls' directive with a non-constant string
(either
+     one of a number of strings of known length or an unknown string)
+     the minimum number of characters is lesser of PRECISION[0] and
+     the length of the shortest known string or zero, and the maximum
+     is the lessser of the length of the longest known string or

s/lessser/lesser/

Jeff

Re: [PATCH] relax -Wformat-overflow for precision ranges (PR 79275)

Reply via email to