Re: [PATCH] Fold (x + 0.0) + 0.0 to x + 0.0 (PR tree-optimization/90356)

Richard Biener Tue, 07 May 2019 06:05:17 -0700

On Tue, 7 May 2019, Jakub Jelinek wrote:

> On Tue, May 07, 2019 at 01:50:23PM +0200, Marc Glisse wrote:
> > > And actually it seems that we could optimize the plus1 == plus2 cases
> > > even if HONOR_SIGN_DEPENDENT_ROUNDING (type), because even in fesetround
> > > (FE_DOWNWARD) mode the testcase prints the first two (in all other modes
> > > all 4).
> > 
> > It is very hard to judge what is ok with -frounding-math, because that mode
> > is already unusably broken (I use a pass-through asm volatile to protect the
> > arguments and result of every operation instead). One important aspect of
> > the optimization is whether both operations use the same rounding mode, or
> > if there may be a call to fesetround in between. Probably we shouldn't care
> > about -frounding-math, since anyway it is likely that it will use some
> > IFN_FANCY_PLUS instead of PLUS_EXPR if it is ever implemented.
> 
> I haven't thought about
>  t = x + 0.0;
>  fesetround (...);
>  y = t + 0.0;
> indeed, let's take -frounding-math out of the patch now.  If we improve
> that mode, such as through explicit dependencies on the floating point state
> in the IL, we can get back to this case too.
> 
> > > + (inner_op @0 @1))))))))
> > 
> > Shouldn't you give it a name in the source pattern and return that, instead
> > of creating a new statement? Or are you doing the operation a second time on
> > purpose in case the rounding mode changed or to force an exception?
> 
> Good idea.
> 
> > 
> > > + (outer_op @0 @2)
> > 
> > With sNaN, this may raise a second exception where we used to have only
> > qNaN+0, no? And the handling of exceptions may have changed in between, etc.
> 
> IEEE 754, I believe, says that for non-zero x, x + (+/-0.0) = x, and the only
> exception raised could be the invalid exception if x is an sNaN, or the Intel
> denormal operand exception (I think we generally don't care about that one),
> and nothing else (there should be no overflow, underflow or inexact, and
> obviously no division by zero).  If the invalid exception is masked off,
> then I believe one can't distinguish between the x + 0.0 and (x + 0.0) + 0.0
> computations: x + 0.0 already raises IE and turns the sNaN into a qNaN, and
> the optional second + 0.0 just keeps it a qNaN without further
> exceptions, unless there is some library call in between which queries the
> accumulated exceptions, clears them, etc.  I believe handling that case right
> is only possible if we make those dependencies explicit in the IL, and only
> under non-default flags.  In any case, I don't see a difference between the
> @3 case, where we keep the inner op, and the case where we keep the outer op
> but remove the inner op.  Both behave the same.
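
For reference, the sNaN argument above can be checked at the value level with
a small C sketch.  This is only an illustration, not part of the patch: the
helper names (bits_of, from_bits, add_zero_n, check_snan_quieting) are made
up here, it assumes IEEE 754 binary64 with the 2008-style quiet-bit encoding
(as on x86-64 and AArch64), and it looks only at the resulting bit patterns,
not at the exception flags.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Reinterpret a double as its binary64 bit pattern and back.  */
uint64_t bits_of (double d) { uint64_t u; memcpy (&u, &d, sizeof u); return u; }
double from_bits (uint64_t u) { double d; memcpy (&d, &u, sizeof d); return d; }

/* Assumption: the quiet bit is the most significant fraction bit.  */
#define QUIET_BIT (UINT64_C (1) << 51)
/* An sNaN: exponent all ones, quiet bit clear, non-zero payload.  */
#define SNAN_BITS UINT64_C (0x7ff0000000000001)

/* Add +0.0 to the value with bit pattern IN, N times.  The volatile
   objects keep the compiler from folding the additions away.  */
uint64_t add_zero_n (uint64_t in, int n)
{
  volatile double zero = 0.0;
  volatile double v = from_bits (in);
  for (int i = 0; i < n; i++)
    v = v + zero;
  return bits_of (v);
}

/* The first addition quiets the sNaN; the second leaves the qNaN alone,
   so one addition and two additions give the same bits.  */
void check_snan_quieting (void)
{
  uint64_t once = add_zero_n (SNAN_BITS, 1);
  uint64_t twice = add_zero_n (SNAN_BITS, 2);
  assert ((SNAN_BITS & QUIET_BIT) == 0);
  assert ((once & QUIET_BIT) != 0);
  assert (once == twice);
}
```

Calling check_snan_quieting () from main is enough to run it.  Whether the
invalid flag is raised once or twice is deliberately not tested, since that
is exactly the part that needs explicit dependencies in the IL.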
> 
> Here is an updated patch with your @3 idea and taking out -frounding-math
> stuff.


OK if there are no further comments.

Richard.

> 2019-05-07  Jakub Jelinek  <ja...@redhat.com>
> 
>       PR tree-optimization/90356
>       * match.pd ((X +/- 0.0) +/- 0.0): Optimize into X +/- 0.0 if possible.
> 
>       * gcc.dg/tree-ssa/pr90356-1.c: New test.
>       * gcc.dg/tree-ssa/pr90356-2.c: New test.
>       * gcc.dg/tree-ssa/pr90356-3.c: New test.
>       * gcc.dg/tree-ssa/pr90356-4.c: New test.
> 
> --- gcc/match.pd.jj   2019-05-07 13:56:53.062954181 +0200
> +++ gcc/match.pd      2019-05-07 14:30:36.010474285 +0200
> @@ -152,6 +152,28 @@ (define_operator_list COND_TERNARY
>   (if (fold_real_zero_addition_p (type, @1, 1))
>    (non_lvalue @0)))
>  
> +/* Even if the fold_real_zero_addition_p can't simplify X + 0.0
> +   into X, we can optimize (X + 0.0) + 0.0 or (X + 0.0) - 0.0
> +   or (X - 0.0) + 0.0 into X + 0.0 and (X - 0.0) - 0.0 into X - 0.0
> +   if not -frounding-math.  For sNaNs the first operation would raise
> +   exceptions but turn the result into qNaN, so the second operation
> +   would not raise it.   */
> +(for inner_op (plus minus)
> + (for outer_op (plus minus)
> +  (simplify
> +   (outer_op (inner_op@3 @0 REAL_CST@1) REAL_CST@2)
> +    (if (real_zerop (@1)
> +      && real_zerop (@2)
> +      && !HONOR_SIGN_DEPENDENT_ROUNDING (type))
> +     (with { bool inner_plus = ((inner_op == PLUS_EXPR)
> +                             ^ REAL_VALUE_MINUS_ZERO (TREE_REAL_CST (@1)));
> +          bool outer_plus
> +            = ((outer_op == PLUS_EXPR)
> +               ^ REAL_VALUE_MINUS_ZERO (TREE_REAL_CST (@2))); }
> +      (if (outer_plus && !inner_plus)
> +       (outer_op @0 @2)
> +       @3))))))
> +
>  /* Simplify x - x.
>     This is unsafe for certain floats even in non-IEEE formats.
>     In IEEE, it is unsafe because it does wrong for NaNs.
> --- gcc/testsuite/gcc.dg/tree-ssa/pr90356-1.c.jj      2019-05-07 14:27:17.912654939 +0200
> +++ gcc/testsuite/gcc.dg/tree-ssa/pr90356-1.c 2019-05-07 14:27:17.912654939 +0200
> @@ -0,0 +1,23 @@
> +/* PR tree-optimization/90356 */
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -fno-rounding-math -fsignaling-nans -fsigned-zeros -fdump-tree-optimized" } */
> +/* { dg-final { scan-tree-dump-times "x_\[0-9]*.D. \\+ 0.0;" 12 "optimized" } } */
> +/* { dg-final { scan-tree-dump-times "y_\[0-9]*.D. - 0.0;" 4 "optimized" } } */
> +/* { dg-final { scan-tree-dump-times " \[+-] 0.0;" 16 "optimized" } } */
> +
> +double f1 (double x) { return (x + 0.0) + 0.0; }
> +double f2 (double y) { return (y + (-0.0)) + (-0.0); }
> +double f3 (double y) { return (y - 0.0) - 0.0; }
> +double f4 (double x) { return (x - (-0.0)) - (-0.0); }
> +double f5 (double x) { return (x + 0.0) - 0.0; }
> +double f6 (double x) { return (x + (-0.0)) - (-0.0); }
> +double f7 (double x) { return (x - 0.0) + 0.0; }
> +double f8 (double x) { return (x - (-0.0)) + (-0.0); }
> +double f9 (double x) { double t = x + 0.0; return t + 0.0; }
> +double f10 (double y) { double t = y + (-0.0); return t + (-0.0); }
> +double f11 (double y) { double t = y - 0.0; return t - 0.0; }
> +double f12 (double x) { double t = x - (-0.0); return t - (-0.0); }
> +double f13 (double x) { double t = x + 0.0; return t - 0.0; }
> +double f14 (double x) { double t = x + (-0.0); return t - (-0.0); }
> +double f15 (double x) { double t = x - 0.0; return t + 0.0; }
> +double f16 (double x) { double t = x - (-0.0); return t + (-0.0); }
> --- gcc/testsuite/gcc.dg/tree-ssa/pr90356-2.c.jj      2019-05-07 14:27:17.912654939 +0200
> +++ gcc/testsuite/gcc.dg/tree-ssa/pr90356-2.c 2019-05-07 14:27:17.912654939 +0200
> @@ -0,0 +1,8 @@
> +/* PR tree-optimization/90356 */
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -fno-rounding-math -fno-signaling-nans -fsigned-zeros -fdump-tree-optimized" } */
> +/* { dg-final { scan-tree-dump-times "x_\[0-9]*.D. \\+ 0.0;" 12 "optimized" } } */
> +/* { dg-final { scan-tree-dump-times "y_\[0-9]*.D. - 0.0;" 0 "optimized" } } */
> +/* { dg-final { scan-tree-dump-times " \[+-] 0.0;" 12 "optimized" } } */
> +
> +#include "pr90356-1.c"
> --- gcc/testsuite/gcc.dg/tree-ssa/pr90356-3.c.jj      2019-05-07 14:27:17.913654923 +0200
> +++ gcc/testsuite/gcc.dg/tree-ssa/pr90356-3.c 2019-05-07 14:27:17.913654923 +0200
> @@ -0,0 +1,6 @@
> +/* PR tree-optimization/90356 */
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -frounding-math -fsignaling-nans -fsigned-zeros -fdump-tree-optimized" } */
> +/* { dg-final { scan-tree-dump-times " \[+-] 0.0;" 32 "optimized" } } */
> +
> +#include "pr90356-1.c"
> --- gcc/testsuite/gcc.dg/tree-ssa/pr90356-4.c.jj      2019-05-07 14:27:17.913654923 +0200
> +++ gcc/testsuite/gcc.dg/tree-ssa/pr90356-4.c 2019-05-07 14:27:17.913654923 +0200
> @@ -0,0 +1,6 @@
> +/* PR tree-optimization/90356 */
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -frounding-math -fno-signaling-nans -fsigned-zeros -fdump-tree-optimized" } */
> +/* { dg-final { scan-tree-dump-times " \[+-] 0.0;" 32 "optimized" } } */
> +
> +#include "pr90356-1.c"
> 
> 
>       Jakub
> 
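
The inner_plus/outer_plus case analysis in the match.pd pattern quoted above
can be sanity-checked outside the compiler with a sketch like the following.
It is a model under assumptions, not GCC code: add_zero, unfolded, folded,
same_value and count_mismatches are hypothetical names, the default
round-to-nearest mode is assumed (the patch itself excludes
-frounding-math), and all NaNs are treated as equal.

```c
#include <assert.h>
#include <math.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Bit pattern of a double, so that +0.0 and -0.0 compare as different.  */
uint64_t bits_of (double d) { uint64_t u; memcpy (&u, &d, sizeof u); return u; }

/* Add +0.0 (PLUS_ZERO true) or -0.0 (PLUS_ZERO false) to X.  This covers
   all four source forms: x - 0.0 is x + (-0.0), x - (-0.0) is x + 0.0.
   The volatile objects keep the compiler from folding the addition.  */
double add_zero (double x, bool plus_zero)
{
  volatile double z = plus_zero ? 0.0 : -0.0;
  volatile double v = x;
  return v + z;
}

/* The original expression: both operations are performed.  */
double unfolded (double x, bool inner_plus, bool outer_plus)
{
  return add_zero (add_zero (x, inner_plus), outer_plus);
}

/* Model of the simplification: keep only the outer operation when it
   effectively adds +0.0 and the inner one adds -0.0, else keep the inner.  */
double folded (double x, bool inner_plus, bool outer_plus)
{
  if (outer_plus && !inner_plus)
    return add_zero (x, outer_plus);
  return add_zero (x, inner_plus);
}

/* Equal if both are NaN, or if the bit patterns match exactly.  */
bool same_value (double a, double b)
{
  if (a != a || b != b)
    return a != a && b != b;
  return bits_of (a) == bits_of (b);
}

/* Compare folded and unfolded forms over a few interesting inputs.  */
int count_mismatches (void)
{
  const double samples[] = { 0.0, -0.0, 1.5, -1.5, INFINITY, -INFINITY, NAN };
  int mismatches = 0;
  for (size_t i = 0; i < sizeof samples / sizeof samples[0]; i++)
    for (int inner = 0; inner < 2; inner++)
      for (int outer = 0; outer < 2; outer++)
        if (!same_value (unfolded (samples[i], inner, outer),
                         folded (samples[i], inner, outer)))
          mismatches++;
  return mismatches;
}
```

The only input where the choice of kept operation matters is x == -0.0:
adding -0.0 leaves it alone, while adding +0.0 turns it into +0.0, which is
why the single surviving operation has to be the one that effectively adds
+0.0 whenever either of them does.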

-- 
Richard Biener <rguent...@suse.de>
SUSE Linux GmbH, Maxfeldstrasse 5, 90409 Nuernberg, Germany;
GF: Felix Imendörffer, Mary Higgins, Sri Rasiah; HRB 21284 (AG Nürnberg)
