On Mon, 6 Jul 2015, Kyrill Tkachov wrote:

> Hi Richard,
>
> On 01/07/15 14:03, Richard Biener wrote:
> > This merges the complete comparison patterns from the match-and-simplify
> > branch, leaving incomplete implementations of fold-const.c code alone.
> >
> > Bootstrapped and tested on x86_64-unknown-linux-gnu, applied.
> >
> > Richard.
> >
> > 2015-07-01 Richard Biener <rguent...@suse.de>
> >
> > 	* fold-const.c (fold_comparison): Move X - Y CMP 0 -> X CMP Y,
> > 	X * C1 CMP 0 -> X CMP 0, X CMP X, ~X CMP ~Y -> Y CMP X and
> > 	~X CMP C -> X CMP' ~C to ...
> > 	* match.pd: ... patterns here.
> >
> >
> > +/* Transform comparisons of the form X - Y CMP 0 to X CMP Y.
> > +   ??? The transformation is valid for the other operators if overflow
> > +   is undefined for the type, but performing it here badly interacts
> > +   with the transformation in fold_cond_expr_with_comparison which
> > +   attempts to synthetize ABS_EXPR. */
> > +(for cmp (eq ne)
> > + (simplify
> > +  (cmp (minus @0 @1) integer_zerop)
> > +  (cmp @0 @1)))
>
> This broke some tests on aarch64:
> FAIL: gcc.target/aarch64/subs.c scan-assembler subs\tw[0-9]
> FAIL: gcc.target/aarch64/subs1.c scan-assembler subs\tw[0-9]+, w[0-9]+, w[0-9]+
> FAIL: gcc.target/aarch64/subs1.c scan-assembler subs\tw[0-9]+, w[0-9]+, w[0-9]+, lsl 3
> FAIL: gcc.target/aarch64/subs1.c scan-assembler subs\tx[0-9]+, x[0-9]+, x[0-9]+
> FAIL: gcc.target/aarch64/subs1.c scan-assembler subs\tx[0-9]+, x[0-9]+, x[0-9]+, lsl 3
>
> To take subs.c as an example:
> There's something odd going on:
> The X - Y CMP 0 -> X CMP Y transformation gets triggered only for the int case
> but not the long long case, but the int case (foo) is the place where the rtl
> ends up being:
>
> (insn 9 4 10 2 (set (reg/v:SI 74 [ l ])
>         (minus:SI (reg/v:SI 76 [ x ])
>             (reg/v:SI 77 [ y ]))) subs.c:9 254 {subsi3}
>      (nil))
> (insn 10 9 11 2 (set (reg:CC 66 cc)
>         (compare:CC (reg/v:SI 76 [ x ])
>             (reg/v:SI 77 [ y ])))
>
> instead of the previous:
>
> (insn 9 4 10 2 (set (reg/v:SI 74 [ l ])
>         (minus:SI (reg/v:SI 76 [ x ])
>             (reg/v:SI 77 [ y ]))) subs.c:9 254 {subsi3}
>
> (insn 10 9 11 2 (set (reg:CC 66 cc)
>         (compare:CC (reg/v:SI 74 [ l ])
>             (const_int 0 [0])))
>
>
> so the transformed X CMP Y does not get matched by combine into a subs.
> Was the transformation before the patch in fold-const.c not getting triggered?

It was prevented from triggering by restricting the transform to
single uses (I am testing a fix that does this right now).

Note that in case you'd write

  int l = x - y;
  if (l == 0)
    return 5;

  /* { dg-final { scan-assembler "subs\tw\[0-9\]" } } */

  z = x - y ;

the simplification will happen anyway because the redundant computation
of z has not yet been eliminated (one reason why such single-use checks
are not 100% the "correct" thing to do).

> In aarch64 we have patterns to match:
>   [(set (reg:CC_NZ CC_REGNUM)
>     (compare:CC_NZ (minus:GPI (match_operand:GPI 1 "register_operand" "r")
>                               (match_operand:GPI 2 "register_operand" "r"))
>                    (const_int 0)))
>    (set (match_operand:GPI 0 "register_operand" "=r")
>     (minus:GPI (match_dup 1) (match_dup 2)))]
>
>
> Should we add a pattern to match:
>   [(set (reg:CC CC_REGNUM)
>     (compare:CC (match_operand:GPI 1 "register_operand" "r")
>                 (match_operand:GPI 2 "register_operand" "r")))
>    (set (match_operand:GPI 0 "register_operand" "=r")
>     (minus:GPI (match_dup 1) (match_dup 2)))]
>
> as well?

No, I don't think so.

Richard.

> Kyrill
>
> > +
> > +/* Transform comparisons of the form X * C1 CMP 0 to X CMP 0 in the
> > +   signed arithmetic case. That form is created by the compiler
> > +   often enough for folding it to be of value. One example is in
> > +   computing loop trip counts after Operator Strength Reduction. */
> > +(for cmp (tcc_comparison)
> > +     scmp (swapped_tcc_comparison)
> > + (simplify
> > +  (cmp (mult @0 INTEGER_CST@1) integer_zerop@2)
> > +  /* Handle unfolded multiplication by zero. */
> > +  (if (integer_zerop (@1))
> > +   (cmp @1 @2))
> > +  (if (ANY_INTEGRAL_TYPE_P (TREE_TYPE (@0))
> > +       && TYPE_OVERFLOW_UNDEFINED (TREE_TYPE (@0)))
> > +   /* If @1 is negative we swap the sense of the comparison. */
> > +   (if (tree_int_cst_sgn (@1) < 0)
> > +    (scmp @0 @2))
> > +   (cmp @0 @2))))
> > +
> > +/* Simplify comparison of something with itself. For IEEE
> > +   floating-point, we can only do some of these simplifications. */
> > +(simplify
> > + (eq @0 @0)
> > + (if (! FLOAT_TYPE_P (TREE_TYPE (@0))
> > +      || ! HONOR_NANS (TYPE_MODE (TREE_TYPE (@0))))
> > +  { constant_boolean_node (true, type); }))
> > +(for cmp (ge le)
> > + (simplify
> > +  (cmp @0 @0)
> > +  (eq @0 @0)))
> > +(for cmp (ne gt lt)
> > + (simplify
> > +  (cmp @0 @0)
> > +  (if (cmp != NE_EXPR
> > +       || ! FLOAT_TYPE_P (TREE_TYPE (@0))
> > +       || ! HONOR_NANS (TYPE_MODE (TREE_TYPE (@0))))
> > +   { constant_boolean_node (false, type); })))
> > +
> > +/* Fold ~X op ~Y as Y op X. */
> > +(for cmp (tcc_comparison)
> > + (simplify
> > +  (cmp (bit_not @0) (bit_not @1))
> > +  (cmp @1 @0)))
> > +
> > +/* Fold ~X op C as X op' ~C, where op' is the swapped comparison. */
> > +(for cmp (tcc_comparison)
> > +     scmp (swapped_tcc_comparison)
> > + (simplify
> > +  (cmp (bit_not @0) CONSTANT_CLASS_P@1)
> > +  (if (TREE_CODE (@1) == INTEGER_CST || TREE_CODE (@1) == VECTOR_CST)
> > +   (scmp @0 (bit_not @1)))))
> > +
> > +
> >  /* Unordered tests if either argument is a NaN. */
> >  (simplify
> >   (bit_ior (unordered @0 @0) (unordered @1 @1))
> >
> >

-- 
Richard Biener <rguent...@suse.de>
SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Dilip Upmanyu,
Graham Norton, HRB 21284 (AG Nuernberg)
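
For readers following the quoted patch, the integer identities behind four
of the moved patterns can be sanity-checked in plain C. The sketch below is
not GCC code and is not taken from the thread: every function name in it is
hypothetical, it checks only the `<` and `==` forms on a handful of sample
values, and it assumes the usual two's complement `int32_t`. The
multiplication case is evaluated on an exact 64-bit product, which stands in
for the "overflow is undefined" condition the pattern requires.

```c
#include <stdint.h>

/* Illustration only: all names below are hypothetical, none are GCC's. */

/* X - Y == 0  <=>  X == Y.  Unsigned arithmetic wraps, so the
   subtraction is always defined. */
static int sub_eq_holds (uint32_t x, uint32_t y)
{
  return ((uint32_t) (x - y) == 0) == (x == y);
}

/* X * C1 < 0  <=>  X < 0 when C1 > 0, and  <=>  X > 0 when C1 < 0
   (the swapped comparison).  The product is computed exactly in
   64 bits, modelling the no-overflow assumption. */
static int mult_cmp_holds (int32_t x, int32_t c1)
{
  int64_t prod = (int64_t) x * c1;
  if (c1 > 0)
    return (prod < 0) == (x < 0);
  if (c1 < 0)
    return (prod < 0) == (x > 0);
  return 1; /* C1 == 0 is the separate multiplication-by-zero case. */
}

/* ~X < ~Y  <=>  Y < X. */
static int not_not_holds (int32_t x, int32_t y)
{
  return (~x < ~y) == (y < x);
}

/* ~X < C  <=>  X > ~C  (swapped comparison, complemented constant). */
static int not_const_holds (int32_t x, int32_t c)
{
  return (~x < c) == (x > ~c);
}

/* Returns 1 if every identity holds for every pair of samples. */
int check_all (void)
{
  static const int32_t samples[] =
    { INT32_MIN, -7, -1, 0, 1, 7, INT32_MAX };
  const int n = (int) (sizeof samples / sizeof samples[0]);

  for (int i = 0; i < n; i++)
    for (int j = 0; j < n; j++)
      {
        int32_t a = samples[i], b = samples[j];
        if (!sub_eq_holds ((uint32_t) a, (uint32_t) b)
            || !mult_cmp_holds (a, b)
            || !not_not_holds (a, b)
            || !not_const_holds (a, b))
          return 0;
      }
  return 1;
}
```

A small driver can call `check_all ()` and treat a zero result as a failed
identity. This checks only that the rewrites preserve the comparison result;
it says nothing about whether the rewritten form still matches a target
pattern such as aarch64 `subs`, which is the issue raised in the thread.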