On Tue, 7 May 2019, Jakub Jelinek wrote:

> On Tue, May 07, 2019 at 01:50:23PM +0200, Marc Glisse wrote:
> > > And actually it seems that we could optimize the plus1 == plus2 cases
> > > even if HONOR_SIGN_DEPENDENT_ROUNDING (type), because even in fesetenv
> > > (FE_DOWNWARD) mode the testcase prints the first two (in all other modes
> > > all 4).
> >
> > It is very hard to judge what is ok with -frounding-math, because that mode
> > is already unusably broken (I use a pass-through asm volatile to protect the
> > arguments and result of every operation instead). One important aspect of
> > the optimization is whether both operations use the same rounding mode, or
> > if there may be a call to fesetround in between. Probably we shouldn't care
> > about -frounding-math, since anyway it is likely that it will use some
> > IFN_FANCY_PLUS instead of PLUS_EXPR if it is ever implemented.
>
> I haven't thought about
>   t = x + 0.0;
>   fesetround (...);
>   y = t + 0.0;
> indeed, let's take -frounding-math out of the patch now. If we improve
> that mode, such as through explicit dependencies on the floating point state
> in the IL, we can get back to this case too.
>
> > > + (inner_op @0 @1))))))))
> >
> > Shouldn't you give it a name in the source pattern and return that, instead
> > of creating a new statement? Or are you doing the operation a second time on
>
> Good idea.
>
> > purpose in case the rounding mode changed or to force an exception?
> >
> > > + (outer_op @0 @2)
> >
> > With sNaN, this may raise a second exception where we used to have only
> > qNaN+0, no? And the handling of exceptions may have changed in between, etc.
>
> IEEE 754 I believe says that for x non-zero x + (+/-0.0) = x and the only
> exception raised could be invalid exception if x is sNaN or the Intel
> denormal operand exception (I think we generally don't care about that one)
> and nothing else (there should be no overflow nor underflow nor inexact and
> obviously no division by zero).
> If the invalid exception is masked off,
> then I believe one can't distinguish between the x + 0.0 and (x + 0.0) + 0.0
> computations, already x + 0.0 will raise IE and turn the sNaN into qNaN and
> the optional second + 0.0 will just keep that to be a qNaN without further
> exceptions, unless there is some library call in between which queries the
> accumulated exceptions, clears it etc. I believe handling that case right
> is only possible if we make those dependencies in the IL explicit and under
> non-default flags. In any case, I don't see a difference between the
> @3 case where we keep the inner op and the case where we keep the outer op
> but remove the inner op. Both behave the same.
>
> Here is an updated patch with your @3 idea and taking out -frounding-math
> stuff.
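
The sign-of-zero reasoning behind the quoted rule can be checked by brute force. The following is only a minimal sketch, assuming the default round-to-nearest mode and ignoring exception flags and signaling NaNs; the helper names (apply_op, same_value, count_mismatches) are hypothetical and are not part of the patch. It mirrors the inner_plus/outer_plus selection from the match.pd pattern and compares the two-step result against the simplified one for a handful of inputs, including both signed zeros.

```c
#include <math.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical helper: compute x + c when is_add is true, else x - c.  */
static double apply_op(bool is_add, double x, double c)
{
    return is_add ? x + c : x - c;
}

/* Hypothetical helper: bit-for-bit equality, so +0.0 and -0.0 differ;
   any two NaNs are treated as equal (payloads are not compared).  */
static bool same_value(double a, double b)
{
    if (isnan(a) || isnan(b))
        return isnan(a) && isnan(b);
    return memcmp(&a, &b, sizeof a) == 0;
}

/* Count the cases where (x op1 c1) op2 c2 differs from the form the
   pattern would pick.  Assumes round-to-nearest; exception flags and
   signaling NaNs are not modelled here.  */
static int count_mismatches(void)
{
    const double xs[] = { 0.0, -0.0, 1.0, -1.0, INFINITY, -INFINITY, NAN };
    const double zeros[] = { 0.0, -0.0 };
    int bad = 0;

    for (size_t i = 0; i < sizeof xs / sizeof xs[0]; i++)
        for (int inner_add = 0; inner_add < 2; inner_add++)
            for (int outer_add = 0; outer_add < 2; outer_add++)
                for (int a = 0; a < 2; a++)
                    for (int b = 0; b < 2; b++) {
                        double c1 = zeros[a], c2 = zeros[b];
                        /* "plus" means the step effectively adds +0.0.  */
                        bool inner_plus = (inner_add != 0) != (signbit(c1) != 0);
                        bool outer_plus = (outer_add != 0) != (signbit(c2) != 0);

                        double inner = apply_op(inner_add, xs[i], c1);
                        double full = apply_op(outer_add, inner, c2);
                        /* Keep the outer op only when the inner one is an
                           identity and the outer one is not.  */
                        double simplified = (outer_plus && !inner_plus)
                            ? apply_op(outer_add, xs[i], c2)
                            : inner;

                        if (!same_value(full, simplified))
                            bad++;
                    }
    return bad;
}
```

Calling count_mismatches () from a small main and checking for 0 should be enough to exercise it; build without -ffast-math or -fno-signed-zeros so the signed zeros are preserved.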

OK if there are no further comments.

Richard.

> 2019-05-07  Jakub Jelinek  <ja...@redhat.com>
>
>       PR tree-optimization/90356
>       * match.pd ((X +/- 0.0) +/- 0.0): Optimize into X +/- 0.0 if possible.
>
>       * gcc.dg/tree-ssa/pr90356-1.c: New test.
>       * gcc.dg/tree-ssa/pr90356-2.c: New test.
>       * gcc.dg/tree-ssa/pr90356-3.c: New test.
>       * gcc.dg/tree-ssa/pr90356-4.c: New test.
>
> --- gcc/match.pd.jj 2019-05-07 13:56:53.062954181 +0200
> +++ gcc/match.pd 2019-05-07 14:30:36.010474285 +0200
> @@ -152,6 +152,28 @@ (define_operator_list COND_TERNARY
>   (if (fold_real_zero_addition_p (type, @1, 1))
>    (non_lvalue @0)))
>
> +/* Even if the fold_real_zero_addition_p can't simplify X + 0.0
> +   into X, we can optimize (X + 0.0) + 0.0 or (X + 0.0) - 0.0
> +   or (X - 0.0) + 0.0 into X + 0.0 and (X - 0.0) - 0.0 into X - 0.0
> +   if not -frounding-math. For sNaNs the first operation would raise
> +   exceptions but turn the result into qNan, so the second operation
> +   would not raise it. */
> +(for inner_op (plus minus)
> + (for outer_op (plus minus)
> +  (simplify
> +   (outer_op (inner_op@3 @0 REAL_CST@1) REAL_CST@2)
> +    (if (real_zerop (@1)
> +         && real_zerop (@2)
> +         && !HONOR_SIGN_DEPENDENT_ROUNDING (type))
> +     (with { bool inner_plus = ((inner_op == PLUS_EXPR)
> +                                ^ REAL_VALUE_MINUS_ZERO (TREE_REAL_CST (@1)));
> +             bool outer_plus
> +               = ((outer_op == PLUS_EXPR)
> +                  ^ REAL_VALUE_MINUS_ZERO (TREE_REAL_CST (@2))); }
> +      (if (outer_plus && !inner_plus)
> +       (outer_op @0 @2)
> +       @3))))))
> +
>  /* Simplify x - x.
>     This is unsafe for certain floats even in non-IEEE formats.
>     In IEEE, it is unsafe because it does wrong for NaNs.
> --- gcc/testsuite/gcc.dg/tree-ssa/pr90356-1.c.jj 2019-05-07 14:27:17.912654939 +0200
> +++ gcc/testsuite/gcc.dg/tree-ssa/pr90356-1.c 2019-05-07 14:27:17.912654939 +0200
> @@ -0,0 +1,23 @@
> +/* PR tree-optimization/90356 */
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -fno-rounding-math -fsignaling-nans -fsigned-zeros -fdump-tree-optimized" } */
> +/* { dg-final { scan-tree-dump-times "x_\[0-9]*.D. \\+ 0.0;" 12 "optimized" } } */
> +/* { dg-final { scan-tree-dump-times "y_\[0-9]*.D. - 0.0;" 4 "optimized" } } */
> +/* { dg-final { scan-tree-dump-times " \[+-] 0.0;" 16 "optimized" } } */
> +
> +double f1 (double x) { return (x + 0.0) + 0.0; }
> +double f2 (double y) { return (y + (-0.0)) + (-0.0); }
> +double f3 (double y) { return (y - 0.0) - 0.0; }
> +double f4 (double x) { return (x - (-0.0)) - (-0.0); }
> +double f5 (double x) { return (x + 0.0) - 0.0; }
> +double f6 (double x) { return (x + (-0.0)) - (-0.0); }
> +double f7 (double x) { return (x - 0.0) + 0.0; }
> +double f8 (double x) { return (x - (-0.0)) + (-0.0); }
> +double f9 (double x) { double t = x + 0.0; return t + 0.0; }
> +double f10 (double y) { double t = y + (-0.0); return t + (-0.0); }
> +double f11 (double y) { double t = y - 0.0; return t - 0.0; }
> +double f12 (double x) { double t = x - (-0.0); return t - (-0.0); }
> +double f13 (double x) { double t = x + 0.0; return t - 0.0; }
> +double f14 (double x) { double t = x + (-0.0); return t - (-0.0); }
> +double f15 (double x) { double t = x - 0.0; return t + 0.0; }
> +double f16 (double x) { double t = x - (-0.0); return t + (-0.0); }
> --- gcc/testsuite/gcc.dg/tree-ssa/pr90356-2.c.jj 2019-05-07 14:27:17.912654939 +0200
> +++ gcc/testsuite/gcc.dg/tree-ssa/pr90356-2.c 2019-05-07 14:27:17.912654939 +0200
> @@ -0,0 +1,8 @@
> +/* PR tree-optimization/90356 */
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -fno-rounding-math -fno-signaling-nans -fsigned-zeros -fdump-tree-optimized" } */
> +/* { dg-final { scan-tree-dump-times "x_\[0-9]*.D. \\+ 0.0;" 12 "optimized" } } */
> +/* { dg-final { scan-tree-dump-times "y_\[0-9]*.D. - 0.0;" 0 "optimized" } } */
> +/* { dg-final { scan-tree-dump-times " \[+-] 0.0;" 12 "optimized" } } */
> +
> +#include "pr90356-1.c"
> --- gcc/testsuite/gcc.dg/tree-ssa/pr90356-3.c.jj 2019-05-07 14:27:17.913654923 +0200
> +++ gcc/testsuite/gcc.dg/tree-ssa/pr90356-3.c 2019-05-07 14:27:17.913654923 +0200
> @@ -0,0 +1,6 @@
> +/* PR tree-optimization/90356 */
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -frounding-math -fsignaling-nans -fsigned-zeros -fdump-tree-optimized" } */
> +/* { dg-final { scan-tree-dump-times " \[+-] 0.0;" 32 "optimized" } } */
> +
> +#include "pr90356-1.c"
> --- gcc/testsuite/gcc.dg/tree-ssa/pr90356-4.c.jj 2019-05-07 14:27:17.913654923 +0200
> +++ gcc/testsuite/gcc.dg/tree-ssa/pr90356-4.c 2019-05-07 14:27:17.913654923 +0200
> @@ -0,0 +1,6 @@
> +/* PR tree-optimization/90356 */
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -frounding-math -fno-signaling-nans -fsigned-zeros -fdump-tree-optimized" } */
> +/* { dg-final { scan-tree-dump-times " \[+-] 0.0;" 32 "optimized" } } */
> +
> +#include "pr90356-1.c"
>
>
>       Jakub
>

-- 
Richard Biener <rguent...@suse.de>
SUSE Linux GmbH, Maxfeldstrasse 5, 90409 Nuernberg, Germany;
GF: Felix Imendörffer, Mary Higgins, Sri Rasiah; HRB 21284 (AG Nürnberg)