Hi Richard,
On 01/07/15 14:03, Richard Biener wrote:
This merges the complete comparison patterns from the match-and-simplify
branch, leaving incomplete implementations of fold-const.c code alone.
Bootstrapped and tested on x86_64-unknown-linux-gnu, applied.
Richard.
2015-07-01 Richard Biener <rguent...@suse.de>
* fold-const.c (fold_comparison): Move X - Y CMP 0 -> X CMP Y,
X * C1 CMP 0 -> X CMP 0, X CMP X, ~X CMP ~Y -> Y CMP X and
~X CMP C -> X CMP' ~C to ...
* match.pd: ... patterns here.
+/* Transform comparisons of the form X - Y CMP 0 to X CMP Y.
+ ??? The transformation is valid for the other operators if overflow
+ is undefined for the type, but performing it here badly interacts
+ with the transformation in fold_cond_expr_with_comparison which
+   attempts to synthesize ABS_EXPR.  */
+(for cmp (eq ne)
+ (simplify
+ (cmp (minus @0 @1) integer_zerop)
+ (cmp @0 @1)))
This broke some tests on aarch64:
FAIL: gcc.target/aarch64/subs.c scan-assembler subs\tw[0-9]
FAIL: gcc.target/aarch64/subs1.c scan-assembler subs\tw[0-9]+, w[0-9]+, w[0-9]+
FAIL: gcc.target/aarch64/subs1.c scan-assembler subs\tw[0-9]+, w[0-9]+, w[0-9]+, lsl 3
FAIL: gcc.target/aarch64/subs1.c scan-assembler subs\tx[0-9]+, x[0-9]+, x[0-9]+
FAIL: gcc.target/aarch64/subs1.c scan-assembler subs\tx[0-9]+, x[0-9]+, x[0-9]+, lsl 3
To take subs.c as an example, something odd is going on: the X - Y CMP 0 -> X CMP Y
transformation is triggered only for the int case (foo), not for the long long case,
and for foo the rtl ends up being:
(insn 9 4 10 2 (set (reg/v:SI 74 [ l ])
(minus:SI (reg/v:SI 76 [ x ])
(reg/v:SI 77 [ y ]))) subs.c:9 254 {subsi3}
(nil))
(insn 10 9 11 2 (set (reg:CC 66 cc)
(compare:CC (reg/v:SI 76 [ x ])
(reg/v:SI 77 [ y ])))
instead of the previous:
(insn 9 4 10 2 (set (reg/v:SI 74 [ l ])
(minus:SI (reg/v:SI 76 [ x ])
(reg/v:SI 77 [ y ]))) subs.c:9 254 {subsi3}
(insn 10 9 11 2 (set (reg:CC 66 cc)
(compare:CC (reg/v:SI 74 [ l ])
(const_int 0 [0])))
so the transformed X CMP Y does not get matched by combine into a subs.
Was the transformation before the patch in fold-const.c not getting triggered?
In aarch64 we have patterns to match:
[(set (reg:CC_NZ CC_REGNUM)
(compare:CC_NZ (minus:GPI (match_operand:GPI 1 "register_operand" "r")
(match_operand:GPI 2 "register_operand" "r"))
(const_int 0)))
(set (match_operand:GPI 0 "register_operand" "=r")
(minus:GPI (match_dup 1) (match_dup 2)))]
Should we add a pattern to match:
[(set (reg:CC CC_REGNUM)
(compare:CC (match_operand:GPI 1 "register_operand" "r")
(match_operand:GPI 2 "register_operand" "r")))
(set (match_operand:GPI 0 "register_operand" "=r")
(minus:GPI (match_dup 1) (match_dup 2)))]
as well?
Kyrill
+
+/* Transform comparisons of the form X * C1 CMP 0 to X CMP 0 in the
+ signed arithmetic case. That form is created by the compiler
+ often enough for folding it to be of value. One example is in
+ computing loop trip counts after Operator Strength Reduction. */
+(for cmp (tcc_comparison)
+ scmp (swapped_tcc_comparison)
+ (simplify
+ (cmp (mult @0 INTEGER_CST@1) integer_zerop@2)
+ /* Handle unfolded multiplication by zero. */
+ (if (integer_zerop (@1))
+ (cmp @1 @2))
+ (if (ANY_INTEGRAL_TYPE_P (TREE_TYPE (@0))
+ && TYPE_OVERFLOW_UNDEFINED (TREE_TYPE (@0)))
+ /* If @1 is negative we swap the sense of the comparison. */
+ (if (tree_int_cst_sgn (@1) < 0)
+ (scmp @0 @2))
+ (cmp @0 @2))))
+
+/* Simplify comparison of something with itself. For IEEE
+ floating-point, we can only do some of these simplifications. */
+(simplify
+ (eq @0 @0)
+ (if (! FLOAT_TYPE_P (TREE_TYPE (@0))
+ || ! HONOR_NANS (TYPE_MODE (TREE_TYPE (@0))))
+ { constant_boolean_node (true, type); }))
+(for cmp (ge le)
+ (simplify
+ (cmp @0 @0)
+ (eq @0 @0)))
+(for cmp (ne gt lt)
+ (simplify
+ (cmp @0 @0)
+ (if (cmp != NE_EXPR
+ || ! FLOAT_TYPE_P (TREE_TYPE (@0))
+ || ! HONOR_NANS (TYPE_MODE (TREE_TYPE (@0))))
+ { constant_boolean_node (false, type); })))
+
+/* Fold ~X op ~Y as Y op X. */
+(for cmp (tcc_comparison)
+ (simplify
+ (cmp (bit_not @0) (bit_not @1))
+ (cmp @1 @0)))
+
+/* Fold ~X op C as X op' ~C, where op' is the swapped comparison. */
+(for cmp (tcc_comparison)
+ scmp (swapped_tcc_comparison)
+ (simplify
+ (cmp (bit_not @0) CONSTANT_CLASS_P@1)
+ (if (TREE_CODE (@1) == INTEGER_CST || TREE_CODE (@1) == VECTOR_CST)
+ (scmp @0 (bit_not @1)))))
+
+
/* Unordered tests if either argument is a NaN. */
(simplify
(bit_ior (unordered @0 @0) (unordered @1 @1))