On 01/09/2017 09:18 AM, Martin Sebor wrote:
On 01/08/2017 09:44 PM, Jeff Law wrote:
On 12/23/2016 02:25 PM, Martin Sebor wrote:
Bug 78703 points out that the decimal point character in floating
directives can be longer than just one byte (in locales where the
decimal point is a multibyte character).  The decimal point can
result in anywhere between 1 and MB_LEN_MAX bytes.  This is unlikely
but locales with two-byte decimal point are known to exist, and
the gimple-ssa-sprintf pass must handle them correctly.

In a comment on the bug Jakub suggests that while printf return
value optimization must correctly deal with the worst case (i.e.,
MB_LEN_MAX of 6 for UTF-8), reflecting the worst case in the text
of warnings could be confusing to users most of whom expect
a single byte decimal point.

Finally, a limitation of the gimple-ssa-sprintf pass has been that
it only understands constant width and precision and treats others
as essentially unlimited even if they are constrained to a limited
range of values.  This results in false positives and negatives
that can be avoided.

The attached patch enhances the pass to overcome both of these
limitations.  It does that by first replacing the exact byte counter
with two other counters: 1) a likely counter that tracks the number
of bytes a directive is likely to result in, and 2) an "unlikely"
byte for lack of a better name, that tracks the unlikely maximum
byte count in cases like multibyte decimal point, and second by
adding range handling for width and precision specified by the
asterisk (such as in sprintf("%*.*i", w, p, i)).

The patch resulted in more extensive changes than I initially
intended but the result is a simplified implementation.  A good
amount of the changes is factoring code out into more general
functions that can be shared throughout the pass.

With these enhancements, although the support for ranges in the
pass is complete, it's not as robust as it could be.  I think
having the pass run later could improve things.

The pass does produce a fair number of warnings for calls to
snprintf in the linux kernel.  Some of these I suspect will be
considered false positives.  I think it might be worth splitting
up the snprintf warning from -Wformat-length and adding a separate
option to control it.

Martin

gcc-78703.diff


PR middle-end/78703 -  -fprintf-return-value floating point handling
incorrect in locales with a mulltibyte decimal point

gcc/ChangeLog:

    PR middle-end/78703
    * gimple-ssa-sprintf.c (get_int_range): New function.
    (struct result_range): Add members.
    (struct format_result): Replace number_chars, number_chars_min, and
    number_chars_max, with struct result_ramge.  Remove constant.
    (format_result::operator+=): Update and define out of class.
    (struct fmtresult): Add constructors.  Remove constant and bounded
    members.
    (format_result::type_max_digits): New function.
    (format_result::adjust_for_width_and_precision): New function.
    (struct conversion_spec): Rename...
    (struct directive): ...to this.
    (struct directive): Add new data members.
    (directive::set_width, directive::set_precison): New functions.
    (bytes_remaining, get_int_range, format_character, format_plain):
Same.
    (should_warn_p, maybe_warn, parse_directive): Same.
    (min_bytes_remaining, add_bytes): Remove.
    (format_percent, get_string_length): Simplify.
    (format_integer): Handle width and precision ranges.
    (format_floating): Same.
    (get_mpfr_format_length): Work around MPFR bugs and simplify.
    (format_string): Factor single character handling into
    format_character.  Handle width and precision ranges.
    (format_directive): Factor out most warning code into maybe_warn.
    (pass_sprintf_length::compute_format_length): Factor out parsing
    into parse_directive.
    (try_substitute_return_value): Handle unlikely maximum byte counter.
    Simplify for better clarity.

gcc/testsuite/ChangeLog:

    PR middle-end/78703
    * gcc.dg/tree-ssa/builtin-sprintf-2.c: Adjust.
    * gcc.dg/tree-ssa/builtin-sprintf-5.c: Same.
    * gcc.dg/tree-ssa/builtin-sprintf-warn-1.c: Same.
    * gcc.dg/tree-ssa/builtin-sprintf-warn-2.c: Same.
    * gcc.dg/tree-ssa/builtin-sprintf-warn-3.c: Same.
    * gcc.dg/tree-ssa/builtin-sprintf-warn-4.c: Same.
    * gcc.dg/tree-ssa/builtin-sprintf-warn-7.c: Same.
    * gcc.dg/tree-ssa/builtin-sprintf-warn-9.c: New test.
    * gcc.dg/tree-ssa/builtin-sprintf.c: Adjust.
    * gcc.dg/format/pr78569.c: Same.


+  /* Get the real type format desription for the target.  */
s/desription/description/




@@ -1613,24 +1738,23 @@ get_string_length (tree str)

   if (lenrange [0] || lenrange [1])
     {
-      fmtresult res;
+      fmtresult res (tree_fits_uhwi_p (lenrange[0])
+             ? tree_to_uhwi (lenrange[0]) : 1 < warn_format_length,
Ordering of operands.



+      else if (0 < min && min < 128)
Order of operands.  It would be helpful if you could just do an
iteration over the patch for these issues.


+      if (0 <= dir.prec[1]
+          && (unsigned HOST_WIDE_INT)dir.prec[1] < res.range.max)
Order of operands.

+      if (0 <= dir.prec[0])
Order of operands.




-      if (res.range.max < (unsigned HOST_WIDE_INT)width)
-    res.range.max = width;
+  if (1 == warn_format_length
Order of operands here.



@@ -1843,27 +2276,28 @@ format_string (const conversion_spec &spec,
tree arg)
@@ -1889,6 +2318,7 @@ format_directive (const
pass_sprintf_length::call_info &info,
      they refer to.  */
   res->knownrange &= fmtres.knownrange;

+#if 1
Presumably you should just get rid of the cpp directives.




-
   /* Compute the number of available bytes in the destination.  There
      must always be at least one byte of space for the terminating
      NUL that's appended after the format string has been
processed.  */
-  unsigned HOST_WIDE_INT navail = min_bytes_remaining (info.objsize,
*res);
+  // unsigned HOST_WIDE_INT navail = min_bytes_remaining
(info.objsize, *res);
+  result_range avail_range = bytes_remaining (info.objsize, *res);
Remove the commented out line.


+  /* Clear UNDER4K in the overall result if the maximum has exceeded
+     the 4k (this is necessary to avoid the return valuye optimization
s/valuye/value/

+  if (!warned
+      /* Only warn at level 2.  */
+      && 1 < warn_format_length
Order of operands.


+  else if ('*' == *pf)
Order of operands.


As much as I hate to say it, I think we need to find a way to break this
down into something more manageable.  It's just impossible to see the
structure of what you're doing with the implementation changes mixed in
with what appear to be refactoring changes.

Is there any way we can start to break this down into something more
manageable?

I'm sorry it's too big.  I actually did start out by doing in stages
and and have a smaller initial patch that the rest of it builds on.
I unfortunately forgot that I had done that when I posted the bigger
patch and then left for my trip.

Attached are the two revisions. The description of the major changes
in each of the two is below.  The patches are bigger in part because
of the name change (conversion_spec to directive and spec to dir),
but also because the range work provided an opportunity to factor
out and consolidate a good amount of code.  IMO, the end result is
quite a bit cleaner than before and should be easier to follow.

I'm not sure how difficult or time consuming it would be to split
up the second of the two patches if it's still too big.  I suspect
it would take a few days.

gcc-78703-1.diff

  Extends struct conversion_spec to struct directive so that ordinary
  format characters can be handled uniformly with other directives
  (those that start with a '%') and adds format_plain to format them.

  It also adds parse_directive and moves directive parsing from
  pass_sprintf_length::compute_format_length to it.

  It makes no changes to the warning or optimization logic and so
  the test updates only to reflect the uniform directive handling.

gcc-78703-2.diff

  Contains the bulk of the rest of the changes.  I.e., it replaces
  the superfluous exact byte counter with the likely counter and
  adds the unlikely counter.

  It replaces the get_width_and_precision function with get_int_range
  to get ranges for width and precision ranges and adds
  fmtresult::adjust_for_width_or_precision to handle with and precision
  ranges.

  It adds tge should_warn_p function and moves all logic to determine
  whether or not to issue a warning at a given point in the processing
  of a format string to it.

  It adds the maybe_warn function and ymoves all the format directive
  overflow and truncation warnings into it.

  It removes the add_bytes function (its fully superseded by
  format_directive.

  Finally, this version also enhances the debugging output to include
  width and precision ranges.
I'm still having trouble wrapping my head around 78703-2.  It's just huge.

So the most obvious thing I see to break out is the should_warn_p changes. If maybe_warn could be a distinct patch too, that would help. It may not seem like much, but it really does help. There may be more that sticks out after that.



jeff

Reply via email to