Re: [PATCH] Fold (x + 0.0) + 0.0 to x + 0.0 (PR tree-optimization/90356)

Richard Biener Tue, 07 May 2019 00:49:24 -0700

On Tue, 7 May 2019, Jakub Jelinek wrote:

> Hi!
> 
> fold_real_zero_addition_p will fold x + (-0.0) or x - 0.0 to x
> when not -frounding-math, but not the rest of the options when
> -fsigned-zeros, and not when -fsignaling-nans.
> If we have (x + 0.0) + 0.0, we can fold that to just x + 0.0 even
> when honoring signed zeros, and IMNSHO even when honoring sNaNs,
> of course unless -frounding-math, then we can't do anything.
> For x other than 0.0, -0.0 and sNaN it is obviously correct, for sNaN
> sNaN + 0.0 will raise an exception and turn the result into qNaN, which
> will not raise further exception on the second addition, so IMHO it is ok
> too (unless we want to say special case -fnon-call-exceptions and the
> exception handler changing the result back to sNaN and expecting yet another
> exception).  For 0.0/-0.0 if we can assume rounding other than towards
> negative infinity, the results are:
>   x                         x
> (0.0 + 0.0) + 0.0 = 0.0 = (0.0 + 0.0)
> (-0.0 + 0.0) + 0.0 = 0.0 = (-0.0 + 0.0)
> (0.0 - 0.0) - 0.0 = 0.0 = (0.0 - 0.0)
> (-0.0 - 0.0) - 0.0 = -0.0 = (-0.0 - 0.0)
> (0.0 + 0.0) - 0.0 = 0.0 = (0.0 + 0.0)
> (-0.0 + 0.0) - 0.0 = 0.0 = (-0.0 + 0.0)
> For the above ones, the two operations are always equal to the inner operation
> (0.0 - 0.0) + 0.0 = 0.0 = 0.0 + 0.0
> (-0.0 - 0.0) + 0.0 = 0.0 = -0.0 + 0.0
> For the above cases, the two operations are always equal to the outer 
> operation
> 
> If it is y + (-0.0), it is equivalent to y - 0.0 and if it is y - (-0.0),
> it is equivalent to y + 0.0 in the above.
> 
> For rounding towards negative infinity, 0.0 - 0.0 is -0.0 rather than 0.0
> and so some of the above equivalencies are not true.
> 
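
The zero table above can be checked mechanically.  Below is a minimal C
sketch, assuming IEEE 754 doubles and the default round-to-nearest mode.
same_value and table_holds are names made up for this illustration and
are not part of the patch.

```c
#include <math.h>

/* Returns 1 if a and b compare equal and also have the same sign bit,
   so that +0.0 and -0.0 are told apart.  */
static int same_value (double a, double b)
{
  return a == b && !signbit (a) == !signbit (b);
}

/* Checks the four shapes of the table for one starting value x0
   (intended for x0 = 0.0 and x0 = -0.0).  The volatile qualifiers keep
   the compiler from folding the arithmetic at compile time.  */
static int table_holds (double x0)
{
  volatile double x = x0, z = 0.0;
  int ok = 1;

  /* Shapes where the pair equals the inner operation.  */
  ok &= same_value ((x + z) + z, x + z);
  ok &= same_value ((x - z) - z, x - z);
  ok &= same_value ((x + z) - z, x + z);

  /* Shape where the pair equals the outer operation applied to x.  */
  ok &= same_value ((x - z) + z, x + z);

  return ok;
}
```

Under those assumptions table_holds (0.0) and table_holds (-0.0) should
both return 1.  With rounding towards negative infinity some of the checks
are expected to fail, as the description says.
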
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
> 
> 2019-05-07  Jakub Jelinek  <ja...@redhat.com>
> 
>       PR tree-optimization/90356
>       * match.pd ((X +/- 0.0) +/- 0.0): Optimize into X +/- 0.0 if possible.
> 
>       * gcc.dg/tree-ssa/pr90356-1.c: New test.
>       * gcc.dg/tree-ssa/pr90356-2.c: New test.
>       * gcc.dg/tree-ssa/pr90356-3.c: New test.
>       * gcc.dg/tree-ssa/pr90356-4.c: New test.
> 
> --- gcc/match.pd.jj   2019-05-03 15:22:07.370401908 +0200
> +++ gcc/match.pd      2019-05-06 11:26:04.701663020 +0200
> @@ -152,6 +152,28 @@ (define_operator_list COND_TERNARY
>   (if (fold_real_zero_addition_p (type, @1, 1))
>    (non_lvalue @0)))
>  
> +/* Even if the fold_real_zero_addition_p can't simplify X + 0.0
> +   into X, we can optimize (X + 0.0) + 0.0 or (X + 0.0) - 0.0
> +   or (X - 0.0) + 0.0 into X + 0.0 and (X - 0.0) - 0.0 into X - 0.0
> +   if not -frounding-math.  For sNaNs the first operation would raise
> +   exceptions but turn the result into qNan, so the second operation
> +   would not raise it.   */
> +(for inner_op (plus minus)
> + (for outer_op (plus minus)
> +  (simplify
> +   (outer_op (inner_op @0 real_zerop@1) real_zerop@2)
> +    (if (TREE_CODE (@1) == REAL_CST
> +      && TREE_CODE (@2) == REAL_CST


I'll leave the "correctness check" to other folks, but the above is
better written as

+   (outer_op (inner_op @0 REAL_CST@1) REAL_CST@2)
+    (if (real_zerop (@1)
+         && real_zerop (@2)

because that gets code-generated better.  Btw, for -fsignaling-nans
can we have a literal sNaN?  If so, you need :c on the inner_op, since
I'm not sure we canonicalize to sNaN + 0.0 rather than 0.0 + sNaN.
Maybe that's not worth optimizing though.  (Since we rule out
-frounding-math, a similar case there doesn't need to be considered.)
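
As an aside on the sNaN reasoning in the description, here is a minimal
sketch of the quieting step only.  It assumes IEEE 754 binary64 with the
common encoding where a set top mantissa bit means qNaN (as on x86 and
ARM), and it does not look at exception flags.  The helper names are
invented for this example.

```c
#include <stdint.h>
#include <string.h>

/* Returns 1 if the bit pattern u encodes a quiet NaN, assuming the
   "top mantissa bit set means quiet" convention.  */
static int is_quiet_nan (uint64_t u)
{
  int is_nan = (u & 0x7ff0000000000000ULL) == 0x7ff0000000000000ULL
               && (u & 0x000fffffffffffffULL) != 0;
  return is_nan && (u & 0x0008000000000000ULL) != 0;
}

/* Builds a signaling NaN from its bit pattern, adds 0.0 to it at run
   time and returns the bit pattern of the sum.  The volatile qualifiers
   keep the compiler from folding the addition.  */
static uint64_t snan_plus_zero_bits (void)
{
  const uint64_t snan_bits = 0x7ff0000000000001ULL;
  double snan, sum;
  volatile double x, z = 0.0;
  uint64_t out;

  memcpy (&snan, &snan_bits, sizeof snan);
  x = snan;
  sum = x + z;
  memcpy (&out, &sum, sizeof out);
  return out;
}
```

On such targets is_quiet_nan (snan_plus_zero_bits ()) is expected to be 1,
which is why the second addition has nothing left to signal.
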

> +      && HONOR_SIGNED_ZEROS (element_mode (type))
> +      && !HONOR_SIGN_DEPENDENT_ROUNDING (element_mode (type)))

You can write HONOR_SIGNED_ZEROS (type) here for brevity.

> +     (with { bool plus1 = ((inner_op == PLUS_EXPR)
> +                        ^ REAL_VALUE_MINUS_ZERO (TREE_REAL_CST (@1)));
> +          bool plus2 = ((outer_op == PLUS_EXPR)
> +                        ^ REAL_VALUE_MINUS_ZERO (TREE_REAL_CST (@2))); }
> +      (if (plus2 && !plus1)
> +       (outer_op @0 @2)
> +       (inner_op @0 @1)))))))
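
For anyone who does want to check the selection logic above, it can be
modelled outside the compiler.  The following is only a sketch under the
assumption of IEEE 754 doubles and round-to-nearest.  enum op, apply,
folded and count_mismatches are hypothetical names for this example, not
GCC internals.

```c
#include <math.h>

enum op { OP_PLUS, OP_MINUS };

/* Performs a + c or a - c at run time (volatile blocks folding).  */
static double apply (enum op o, double a, double c)
{
  volatile double va = a, vc = c;
  return o == OP_PLUS ? va + vc : va - vc;
}

static int same_value (double a, double b)
{
  return a == b && !signbit (a) == !signbit (b);
}

/* Mirrors the (with ...) block: plusN is true when the N-th operation
   effectively adds +0.0, false when it effectively subtracts +0.0.  */
static double folded (double x, enum op inner, double c1,
                      enum op outer, double c2)
{
  int plus1 = (inner == OP_PLUS) ^ (signbit (c1) != 0);
  int plus2 = (outer == OP_PLUS) ^ (signbit (c2) != 0);

  if (plus2 && !plus1)
    return apply (outer, x, c2);
  return apply (inner, x, c1);
}

/* Compares the unfolded and folded forms over all operator and zero-sign
   combinations for a few values of x; returns the number of mismatches.  */
static int count_mismatches (void)
{
  static const double xs[] = { 0.0, -0.0, 1.5, -2.25 };
  static const double zeros[] = { 0.0, -0.0 };
  int bad = 0;

  for (int i = 0; i < 4; i++)
    for (int in = 0; in < 2; in++)
      for (int a = 0; a < 2; a++)
        for (int out = 0; out < 2; out++)
          for (int b = 0; b < 2; b++)
            {
              enum op inner = in ? OP_MINUS : OP_PLUS;
              enum op outer = out ? OP_MINUS : OP_PLUS;
              double full = apply (outer, apply (inner, xs[i], zeros[a]),
                                   zeros[b]);
              if (!same_value (full, folded (xs[i], inner, zeros[a],
                                             outer, zeros[b])))
                bad++;
            }
  return bad;
}
```

count_mismatches () should return 0 here.  NaN inputs are left out on
purpose, since same_value would reject them.
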
> +
>  /* Simplify x - x.
>     This is unsafe for certain floats even in non-IEEE formats.
>     In IEEE, it is unsafe because it does wrong for NaNs.
> --- gcc/testsuite/gcc.dg/tree-ssa/pr90356-1.c.jj      2019-05-06 11:39:58.998288472 +0200
> +++ gcc/testsuite/gcc.dg/tree-ssa/pr90356-1.c 2019-05-06 11:42:53.597489688 +0200
> @@ -0,0 +1,23 @@
> +/* PR tree-optimization/90356 */
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -fno-rounding-math -fsignaling-nans -fsigned-zeros -fdump-tree-optimized" } */
> +/* { dg-final { scan-tree-dump-times "x_\[0-9]*.D. \\+ 0.0;" 12 "optimized" } } */
> +/* { dg-final { scan-tree-dump-times "y_\[0-9]*.D. - 0.0;" 4 "optimized" } } */
> +/* { dg-final { scan-tree-dump-times " \[+-] 0.0;" 16 "optimized" } } */
> +
> +double f1 (double x) { return (x + 0.0) + 0.0; }
> +double f2 (double y) { return (y + (-0.0)) + (-0.0); }
> +double f3 (double y) { return (y - 0.0) - 0.0; }
> +double f4 (double x) { return (x - (-0.0)) - (-0.0); }
> +double f5 (double x) { return (x + 0.0) - 0.0; }
> +double f6 (double x) { return (x + (-0.0)) - (-0.0); }
> +double f7 (double x) { return (x - 0.0) + 0.0; }
> +double f8 (double x) { return (x - (-0.0)) + (-0.0); }
> +double f9 (double x) { double t = x + 0.0; return t + 0.0; }
> +double f10 (double y) { double t = y + (-0.0); return t + (-0.0); }
> +double f11 (double y) { double t = y - 0.0; return t - 0.0; }
> +double f12 (double x) { double t = x - (-0.0); return t - (-0.0); }
> +double f13 (double x) { double t = x + 0.0; return t - 0.0; }
> +double f14 (double x) { double t = x + (-0.0); return t - (-0.0); }
> +double f15 (double x) { double t = x - 0.0; return t + 0.0; }
> +double f16 (double x) { double t = x - (-0.0); return t + (-0.0); }
> --- gcc/testsuite/gcc.dg/tree-ssa/pr90356-2.c.jj      2019-05-06 11:43:07.232271129 +0200
> +++ gcc/testsuite/gcc.dg/tree-ssa/pr90356-2.c 2019-05-06 11:45:41.145803937 +0200
> @@ -0,0 +1,8 @@
> +/* PR tree-optimization/90356 */
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -fno-rounding-math -fno-signaling-nans -fsigned-zeros -fdump-tree-optimized" } */
> +/* { dg-final { scan-tree-dump-times "x_\[0-9]*.D. \\+ 0.0;" 12 "optimized" } } */
> +/* { dg-final { scan-tree-dump-times "y_\[0-9]*.D. - 0.0;" 0 "optimized" } } */
> +/* { dg-final { scan-tree-dump-times " \[+-] 0.0;" 12 "optimized" } } */
> +
> +#include "pr90356-1.c"
> --- gcc/testsuite/gcc.dg/tree-ssa/pr90356-3.c.jj      2019-05-06 11:45:05.056382441 +0200
> +++ gcc/testsuite/gcc.dg/tree-ssa/pr90356-3.c 2019-05-06 11:47:19.779222871 +0200
> @@ -0,0 +1,6 @@
> +/* PR tree-optimization/90356 */
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -frounding-math -fsignaling-nans -fsigned-zeros -fdump-tree-optimized" } */
> +/* { dg-final { scan-tree-dump-times " \[+-] 0.0;" 32 "optimized" } } */
> +
> +#include "pr90356-1.c"
> --- gcc/testsuite/gcc.dg/tree-ssa/pr90356-4.c.jj      2019-05-06 11:46:02.140467400 +0200
> +++ gcc/testsuite/gcc.dg/tree-ssa/pr90356-4.c 2019-05-06 11:47:28.175088284 +0200
> @@ -0,0 +1,6 @@
> +/* PR tree-optimization/90356 */
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -frounding-math -fno-signaling-nans -fsigned-zeros -fdump-tree-optimized" } */
> +/* { dg-final { scan-tree-dump-times " \[+-] 0.0;" 32 "optimized" } } */
> +
> +#include "pr90356-1.c"
> 
>       Jakub
> 

-- 
Richard Biener <rguent...@suse.de>
SUSE Linux GmbH, Maxfeldstrasse 5, 90409 Nuernberg, Germany;
GF: Felix Imendörffer, Mary Higgins, Sri Rasiah; HRB 21284 (AG Nürnberg)
