On Tue, Feb 10, 2015 at 9:55 PM, Jeff Law <l...@redhat.com> wrote:
>
> This PR was originally a minor issue where we regressed on this kind of
> sequence:
>
> typedef struct toto_s *toto_t;
> toto_t add (toto_t a, toto_t b) {
>   int64_t tmp = (int64_t)(intptr_t)a + ((int64_t)(intptr_t)b&~1L);
>   return (toto_t)(intptr_t) tmp;
> }
>
> There was talk of trying to peephole this in the x86 backend.  But later
> Jakub speculated that if we had good type narrowing this could be done in
> the tree optimizers...
>
> Soooo, here we go.  I didn't do anything with logicals as those are already
> handled elsewhere in match.pd.  I didn't try to handle MULT; in the early
> experiments I did, it was a loss because of the existing mechanisms for
> widening multiplications.
>
> Interestingly enough, this patch seems to help out libjava more than
> anything else in a GCC build, and it really only helps a few routines.
> There weren't any routines I could see where the code regressed after this
> patch.  This is probably an indicator that these things aren't *that*
> common, or the existing shortening code is better than we thought, or some
> important shortening case is missing.
>
> I think we should pull the other tests from 47477 which are not regressions
> out into their own bug for future work.  Or alternately, when this fix is
> checked in, remove the regression marker in 47477.
>
> Bootstrapped and regression tested on x86_64-unknown-linux-gnu.  OK for the
> trunk?
>
> diff --git a/gcc/ChangeLog b/gcc/ChangeLog
> index 7f3816c..7a95029 100644
> --- a/gcc/ChangeLog
> +++ b/gcc/ChangeLog
> @@ -1,3 +1,8 @@
> +2015-02-10  Jeff Law  <l...@redhat.com>
> +
> +	* match.pd (convert (plus/minus (convert @0) (convert @1))): New
> +	simplifier to narrow arithmetic.
> +
Please reference PR47477 from here.

> 2015-02-10  Richard Biener  <rguent...@suse.de>
>
> 	PR tree-optimization/64909
> diff --git a/gcc/match.pd b/gcc/match.pd
> index 81c4ee6..abc703e 100644
> --- a/gcc/match.pd
> +++ b/gcc/match.pd
> @@ -1018,3 +1018,21 @@ along with GCC; see the file COPYING3.  If not see
>     (logs (pows @0 @1))
>     (mult @1 (logs @0)))))
>
> +/* If we have a narrowing conversion of an arithmetic operation where
> +   both operands are widening conversions from the same type as the outer
> +   narrowing conversion.  Then convert the innermost operands to a suitable
> +   unsigned type (to avoid introducing undefined behaviour), perform the
> +   operation and convert the result to the desired type.
> +
> +   This narrows the arithmetic operation.  */

This is also a part of what shorten_binary_op does, so please refer to
that in the comment so we can eventually complete this from there and
remove shorten_binary_op.

> +(for op (plus minus)
> +  (simplify
> +    (convert (op (convert@2 @0) (convert @1)))
> +    (if (TREE_TYPE (@0) == TREE_TYPE (@1)
> +	 && TREE_TYPE (@0) == type

Elsewhere in match.pd we use

  (GIMPLE && types_compatible_p (TREE_TYPE (@0), TREE_TYPE (@1)))
  || (GENERIC && TREE_TYPE (@0) == TREE_TYPE (@1))

for type equality checks.  And I think even for GENERIC you want to
ignore qualifiers and thus use
TYPE_MAIN_VARIANT (TREE_TYPE (@0)) == TYPE_MAIN_VARIANT (TREE_TYPE (@1)).

> +	 && INTEGRAL_TYPE_P (type)

So instead of testing TREE_TYPE (@0) == type we probably want to just
assert that TYPE_PRECISION (TREE_TYPE (@0)) == TYPE_PRECISION (type)
to allow sign-changes.  Say for

  short i, j;
  ... (unsigned short) (i + j) ...

we still want to narrow the widened i and j.

> +	 && TYPE_PRECISION (TREE_TYPE (@2)) > TYPE_PRECISION (TREE_TYPE (@0))
> +	 /* This prevents infinite recursion.  */
> +	 && unsigned_type_for (TREE_TYPE (@0)) != TREE_TYPE (@2))

How can it recurse with the type precision constraint right above?
If you think it's necessary then nesting that condition below as

> +    (with { tree utype = unsigned_type_for (TREE_TYPE (@0)); }

  (if (utype != TREE_TYPE (@2))

avoids evaluating unsigned_type_for twice.

Note that technically we don't need to perform the operation in an
unsigned type iff TYPE_OVERFLOW_WRAPS (TREE_TYPE (@0)).  Thus

  (if (TYPE_OVERFLOW_WRAPS (TREE_TYPE (@0)))
   (convert (op @0 @1))
   (with { tree utype = unsigned_type_for (TREE_TYPE (@0)); }
    (convert (op (convert:utype @0) (convert:utype @1)))))

You can see that GCC exploits that with -fwrapv.  Maybe this
simplification should be split out into a separate pattern though,
tackling sign-changes for binary ops only.

Thanks,
Richard.

> +      (convert (op (convert:utype @0) (convert:utype @1)))))))
> diff --git a/gcc/testsuite/ChangeLog b/gcc/testsuite/ChangeLog
> index 15d5e2d..76e5254 100644
> --- a/gcc/testsuite/ChangeLog
> +++ b/gcc/testsuite/ChangeLog
> @@ -1,3 +1,8 @@
> +2015-02-10  Jeff Law  <l...@redhat.com>
> +
> +	PR rtl-optimization/47477
> +	* gcc.dg/tree-ssa/narrow-arith-1.c: New test.
> +
> 2015-02-10  Richard Biener  <rguent...@suse.de>
>
> 	PR tree-optimization/64909
> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/narrow-arith-1.c
> b/gcc/testsuite/gcc.dg/tree-ssa/narrow-arith-1.c
> new file mode 100644
> index 0000000..104cb6f5
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/tree-ssa/narrow-arith-1.c
> @@ -0,0 +1,22 @@
> +/* PR tree-optimization/47477 */
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -fdump-tree-optimized -w" } */
> +/* { dg-require-effective-target ilp32 } */
> +
> +typedef int int64_t __attribute__ ((__mode__ (__DI__)));
> +typedef int * intptr_t;
> +
> +typedef struct toto_s *toto_t;
> +toto_t add (toto_t a, toto_t b) {
> +  int64_t tmp = (int64_t)(intptr_t)a + ((int64_t)(intptr_t)b&~1L);
> +  return (toto_t)(intptr_t) tmp;
> +}
> +
> +/* For an ILP32 target there'll be 6 casts when we start, but just 4
> +   if the match.pd pattern is successfully matched.
> +   */
> +/* { dg-final { scan-tree-dump-times "= \\(int\\)" 1 "optimized" } } */
> +/* { dg-final { scan-tree-dump-times "= \\(unsigned int\\)" 2 "optimized" } } */
> +/* { dg-final { scan-tree-dump-times "= \\(struct toto_s \\*\\)" 1 "optimized" } } */
> +/* { dg-final { cleanup-tree-dump "optimized" } } */