On Tue, Dec 22, 2015 at 12:39 AM, Jeff Law <l...@redhat.com> wrote: > > As outlined in the BZ, given > > (x & y) & C > > reassociation will turn that into > > (x & C) & y > > Which inhibits bit operations on various targets. > > Associating as > > (x & y) & C > > is what we want. > > My original patch did this for all binary operations. However, reviewing > the assembly code & dump files before/after it was pretty clear that doing > this for arithmetic was losing (mostly in that we were missing CSE > opportunities).
The missed CSE opportunities really are a CSE missed optimization in general with respect to two reassociatable chains with some (but not all) common operands. > Restricting to logicals means there's far fewer triggering cases, but in > every one I looked at the resultant code was clearly better. > > The testcase is a bit unusual in that it's in tree-ssa. It checks the > output of reassociation. However, on x86 and m68k it also checks the > resulting assembly code, which I believe is unique in the tree-ssa > directory. > > Bootstrapped and regression tested on x86_64-linux-gnu. The test has been > verified on x86_64-linux-gnu, i686-linux-gnu and m68k-linux-gnu. > > OK for the trunk? So you are really trying to make RTL expansion see a different pattern? It seems to me that in this case using TER and get_def_for_expr / get_gimple_for_ssa_name should be used there. [or for future direction instruction selection should be performed on GIMPLE with some match-and-simplify patterns creating IFNs matching optabs if available] Richard. > diff --git a/gcc/ChangeLog b/gcc/ChangeLog > index c7626ff..f5ca85e 100644 > --- a/gcc/ChangeLog > +++ b/gcc/ChangeLog > @@ -1,3 +1,9 @@ > +2015-12-20 Jeff Law <l...@redhat.com> > + > + PR tree-optimization/64910 > + * tree-ssa-reassoc.c (swap_ops_for_binary_stmt): Make sure constant > + is last for binary bit operations. > + > 2015-12-21 Pierre-Marie de Rodat <dero...@adacore.com> > > * dwarf2out.c (add_data_member_location_attribute): Do not > diff --git a/gcc/testsuite/ChangeLog b/gcc/testsuite/ChangeLog > index bb2ed22..e2f55f3 100644 > --- a/gcc/testsuite/ChangeLog > +++ b/gcc/testsuite/ChangeLog > @@ -1,3 +1,8 @@ > +2015-12-20 Jeff Law <l...@redhat.com> > + > + PR tree-optimization/64910 > + * gcc.dg/tree-ssa/pr64910-2.c.c: New test. > + > 2015-12-21 Claudiu Zissulescu <claz...@synopsys.com> > > * gcc.target/arc/builtin_general.c: New test. > diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr64910-2.c > b/gcc/testsuite/gcc.dg/tree-ssa/pr64910-2.c > new file mode 100644 > index 0000000..2e3d679 > --- /dev/null > +++ b/gcc/testsuite/gcc.dg/tree-ssa/pr64910-2.c > @@ -0,0 +1,85 @@ > +/* { dg-do compile } */ > +/* { dg-options "-O2 -fdump-tree-reassoc1" } */ > + > +/* We want to make sure that we reassociate in a way that has the > + constant last. With the constant last, it's more likely to result > + in a bitfield test on targets with such capabilities. */ > + > +extern void boo (); > + > +int b2b_uc (unsigned char u, unsigned char w) > +{ > + if ((u & w) & 0x20) > + boo (); > +} > + > +int b2b_us (unsigned short u, unsigned short w) > +{ > + if ((u & w) & 0x20) > + boo (); > +} > + > +int b2b_ui (unsigned int u, unsigned int w) > +{ > + if ((u & w) & 0x20) > + boo (); > +} > +int b2b_ul (unsigned long u, unsigned long w) > +{ > + if ((u & w) & 0x20) > + boo (); > +} > +int b2b_ull (unsigned long long u, unsigned long long w) > +{ > + if ((u & w) & 0x20) > + boo (); > +} > + > +int b2b_sc (signed char u, signed char w) > +{ > + if ((u & w) & 0x20) > + boo (); > +} > + > +int b2b_ss (signed short u, signed short w) > +{ > + if ((u & w) & 0x20) > + boo (); > +} > + > +int b2b_si (signed int u, signed int w) > +{ > + if ((u & w) & 0x20) > + boo (); > +} > +int b2b_sl (signed long u, signed long w) > +{ > + if ((u & w) & 0x20) > + boo (); > +} > +int b2b_sll (signed long long u, signed long long w) > +{ > + if ((u & w) & 0x20) > + boo (); > +} > + > +/* The AND of U & W should go into a temporary, when is then ANDed > + with the constant. > + > + First verify that we have the right number of ANDs between U and W. */ > +/* { dg-final { scan-tree-dump-times "\[uw\]_\[0-9\]+.D. \& > \[uw\]_\[0-9\]+.D.;" 10 "reassoc1"} } */ > + > +/* Then verify that we have the right number of ANDS between a temporary > + and the constant. */ > +/* { dg-final { scan-tree-dump-times "_\[0-9]+ \& 32;" 10 "reassoc1"} } */ > + > +/* Each function has one AND. It will have either a second AND or TEST. > So > + we can count the number of AND and TEST instructions. They must be 2X > + the number of test functions in this file. */ > +/* { dg-final { scan-assembler-times "and|test" 20 { target { i?86-*-* > x86_64-*-*} } } } */ > + > +/* Similarly on the m68k. The code for the long long tests is suboptimal, > + which catch via the second pattern and xfail. */ > +/* { dg-final { scan-assembler-times "and|btst" 20 { target { m68k-*-* } } > } } */ > +/* { dg-final { scan-assembler-not "or" { target { m68k-*-* } xfail { *-*-* > } } } } */ > + > diff --git a/gcc/tree-ssa-reassoc.c b/gcc/tree-ssa-reassoc.c > index e54700e..1dcfce3 100644 > --- a/gcc/tree-ssa-reassoc.c > +++ b/gcc/tree-ssa-reassoc.c > @@ -3449,6 +3449,26 @@ swap_ops_for_binary_stmt (vec<operand_entry *> ops, > oe1->op = temp.op; > oe1->rank = temp.rank; > } > + /* For X OP Y OP C, associate as (X OP Y) OP C, but only for > + logicals, which encourages bit operations. Experimentation > + has shown this generally isn't a win for arithmetic. */ > + else if (stmt) > + { > + enum tree_code opcode = gimple_assign_rhs_code (stmt); > + if ((opcode == BIT_AND_EXPR > + || opcode == BIT_IOR_EXPR > + || opcode == BIT_XOR_EXPR) > + && TREE_CODE (oe1->op) != INTEGER_CST > + && TREE_CODE (oe2->op) != INTEGER_CST > + && TREE_CODE (oe3->op) == INTEGER_CST) > + { > + operand_entry temp = *oe3; > + oe3->op = oe1->op; > + oe3->rank = oe1->rank; > + oe1->op = temp.op; > + oe1->rank= temp.rank; > + } > + } > } > > /* If definition of RHS1 or RHS2 dominates STMT, return the later of those >