Re: [PATCH] better handling of ranges (PR 78703)

Jeff Law Sun, 08 Jan 2017 20:44:23 -0800

On 12/23/2016 02:25 PM, Martin Sebor wrote:

Bug 78703 points out that the decimal point character in floating
directives can be longer than just one byte (in locales where the
decimal point is a multibyte character).  The decimal point can
result in anywhere between 1 and MB_LEN_MAX bytes.  This is unlikely
but locales with two-byte decimal point are known to exist, and
the gimple-ssa-sprintf pass must handle them correctly.


In a comment on the bug Jakub suggests that while printf return
value optimization must correctly deal with the worst case (i.e.,
MB_LEN_MAX of 6 for UTF-8), reflecting the worst case in the text
of warnings could be confusing to users most of whom expect
a single byte decimal point.

Finally, a limitation of the gimple-ssa-sprintf pass has been that
it only understands constant width and precision and treats others
as essentially unlimited even if they are constrained to a limited
range of values.  This results in false positives and negatives
that can be avoided.

The attached patch enhances the pass to overcome both of these
limitations.  It does that by first replacing the exact byte counter
with two other counters: 1) a likely counter that tracks the number
of bytes a directive is likely to result in, and 2) an "unlikely"
byte for lack of a better name, that tracks the unlikely maximum
byte count in cases like multibyte decimal point, and second by
adding range handling for width and precision specified by the
asterisk (such as in sprintf("%*.*i", w, p, i)).

The patch resulted in more extensive changes than I initially
intended but the result is a simplified implementation.  A good
amount of the changes is factoring code out into more general
functions that can be shared throughout the pass.

With these enhancements, although the support for ranges in the
pass is complete, it's not as robust as it could be.  I think
having the pass run later could improve things.

The pass does produce a fair number of warnings for calls to
snprintf in the linux kernel.  Some of these I suspect will be
considered false positives.  I think it might be worth splitting
up the snprintf warning from -Wformat-length and adding a separate
option to control it.

Martin

gcc-78703.diff


PR middle-end/78703 -  -fprintf-return-value floating point handling incorrect 
in locales with a mulltibyte decimal point

gcc/ChangeLog:

        PR middle-end/78703
        * gimple-ssa-sprintf.c (get_int_range): New function.
        (struct result_range): Add members.
        (struct format_result): Replace number_chars, number_chars_min, and
        number_chars_max, with struct result_ramge.  Remove constant.
        (format_result::operator+=): Update and define out of class.
        (struct fmtresult): Add constructors.  Remove constant and bounded
        members.
        (format_result::type_max_digits): New function.
        (format_result::adjust_for_width_and_precision): New function.
        (struct conversion_spec): Rename...
        (struct directive): ...to this.
        (struct directive): Add new data members.
        (directive::set_width, directive::set_precison): New functions.
        (bytes_remaining, get_int_range, format_character, format_plain): Same.
        (should_warn_p, maybe_warn, parse_directive): Same.
        (min_bytes_remaining, add_bytes): Remove.
        (format_percent, get_string_length): Simplify.
        (format_integer): Handle width and precision ranges.
        (format_floating): Same.
        (get_mpfr_format_length): Work around MPFR bugs and simplify.
        (format_string): Factor single character handling into
        format_character.  Handle width and precision ranges.
        (format_directive): Factor out most warning code into maybe_warn.
        (pass_sprintf_length::compute_format_length): Factor out parsing
        into parse_directive.
        (try_substitute_return_value): Handle unlikely maximum byte counter.
        Simplify for better clarity.

gcc/testsuite/ChangeLog:

        PR middle-end/78703
        * gcc.dg/tree-ssa/builtin-sprintf-2.c: Adjust.
        * gcc.dg/tree-ssa/builtin-sprintf-5.c: Same.
        * gcc.dg/tree-ssa/builtin-sprintf-warn-1.c: Same.
        * gcc.dg/tree-ssa/builtin-sprintf-warn-2.c: Same.
        * gcc.dg/tree-ssa/builtin-sprintf-warn-3.c: Same.
        * gcc.dg/tree-ssa/builtin-sprintf-warn-4.c: Same.
        * gcc.dg/tree-ssa/builtin-sprintf-warn-7.c: Same.
        * gcc.dg/tree-ssa/builtin-sprintf-warn-9.c: New test.
        * gcc.dg/tree-ssa/builtin-sprintf.c: Adjust.
        * gcc.dg/format/pr78569.c: Same.

+  /* Get the real type format desription for the target.  */

s/desription/description/

@@ -1613,24 +1738,23 @@ get_string_length (tree str)

   if (lenrange [0] || lenrange [1])
     {
-      fmtresult res;
+      fmtresult res (tree_fits_uhwi_p (lenrange[0])
+                    ? tree_to_uhwi (lenrange[0]) : 1 < warn_format_length,

Ordering of operands.



+         else if (0 < min && min < 128)

Order of operands. It would be helpful if you could just do aniteration over the patch for these issues.

+         if (0 <= dir.prec[1]
+             && (unsigned HOST_WIDE_INT)dir.prec[1] < res.range.max)

Order of operands.

+      if (0 <= dir.prec[0])

Order of operands.

-      if (res.range.max < (unsigned HOST_WIDE_INT)width)
-       res.range.max = width;
+  if (1 == warn_format_length

Order of operands here.

@@ -1843,27 +2276,28 @@ format_string (const conversion_spec &spec, tree arg)
@@ -1889,6 +2318,7 @@ format_directive (const pass_sprintf_length::call_info 
&info,
      they refer to.  */
   res->knownrange &= fmtres.knownrange;

+#if 1

Presumably you should just get rid of the cpp directives.

-
   /* Compute the number of available bytes in the destination.  There
      must always be at least one byte of space for the terminating
      NUL that's appended after the format string has been processed.  */
-  unsigned HOST_WIDE_INT navail = min_bytes_remaining (info.objsize, *res);
+  // unsigned HOST_WIDE_INT navail = min_bytes_remaining (info.objsize, *res);
+  result_range avail_range = bytes_remaining (info.objsize, *res);

Remove the commented out line.

+  /* Clear UNDER4K in the overall result if the maximum has exceeded
+     the 4k (this is necessary to avoid the return valuye optimization

s/valuye/value/

+  if (!warned
+      /* Only warn at level 2.  */
+      && 1 < warn_format_length

Order of operands.

+  else if ('*' == *pf)

Order of operands.

As much as I hate to say it, I think we need to find a way to break thisdown into something more manageable. It's just impossible to see thestructure of what you're doing with the implementation changes mixed inwith what appear to be refactoring changes.

Is there any way we can start to break this down into something moremanageable?


Jeff

Re: [PATCH] better handling of ranges (PR 78703)

Reply via email to