Hi all,

Following the same rationale as patch 1/3, this patch changes the AArch64 backend to keep the CTZ expression as a single RTX until after reload, at which point it is split into an RBIT and a CLZ instruction. This enables CTZ-specific optimisations in the pre-reload RTL optimisers.
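For readers less familiar with the expansion, the split relies on the identity ctz (x) == clz (rbit (x)). The following is only a rough software model of that identity, not part of the patch: the helper names rbit32, clz32 and ctz32_via_rbit_clz are hypothetical, and the loops stand in for the single RBIT and CLZ instructions on a 32-bit register.

```c
#include <stdint.h>

/* Hypothetical software model of RBIT on a 32-bit value:
   reverse the order of all 32 bits.  */
static uint32_t
rbit32 (uint32_t x)
{
  uint32_t r = 0;
  for (int i = 0; i < 32; i++)
    {
      r = (r << 1) | (x & 1);
      x >>= 1;
    }
  return r;
}

/* Hypothetical software model of CLZ on a 32-bit value: count the
   zero bits above the most significant set bit.  Returns 32 for a
   zero input, matching the AArch64 CLZ instruction.  */
static int
clz32 (uint32_t x)
{
  int n = 0;
  for (uint32_t mask = UINT32_C (1) << 31;
       mask != 0 && (x & mask) == 0;
       mask >>= 1)
    n++;
  return n;
}

/* CTZ expressed as the RBIT + CLZ pair that the pattern is split
   into after reload: reversing the bits turns trailing zeros into
   leading zeros.  */
static int
ctz32_via_rbit_clz (uint32_t x)
{
  return clz32 (rbit32 (x));
}
```

For example, ctz32_via_rbit_clz (8) reverses bit 3 into bit 28 and then counts three leading zeros. A zero input gives 32 in this model, since clz32 of zero is 32.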

Bootstrapped and tested on aarch64-none-linux-gnu.

Ok for trunk?

Thanks,
Kyrill

2016-05-26  Kyrylo Tkachov  <kyrylo.tkac...@arm.com>

	PR middle-end/37780
	* config/aarch64/aarch64.md (ctz<mode>2): Convert to
	define_insn_and_split.

diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
index a9e811e9f70f650fb9292b6d9a96ef4b2dbbaec6..7b3e2cd13bdcc05defda1e3ff74bf003443fe70f 100644
--- a/gcc/config/aarch64/aarch64.md
+++ b/gcc/config/aarch64/aarch64.md
@@ -3790,16 +3790,23 @@ (define_insn "rbit<mode>2"
   [(set_attr "type" "rbit")]
 )
 
-(define_expand "ctz<mode>2"
-  [(match_operand:GPI 0 "register_operand")
-   (match_operand:GPI 1 "register_operand")]
+;; Split after reload into RBIT + CLZ. Since RBIT is represented as an UNSPEC
+;; it is unlikely to fold with any other operation, so keep this as a CTZ
+;; expression and split after reload to enable scheduling them apart if
+;; needed.
+
+(define_insn_and_split "ctz<mode>2"
+ [(set (match_operand:GPI 0 "register_operand" "=r")
+       (ctz:GPI (match_operand:GPI 1 "register_operand" "r")))]
   ""
-  {
-    emit_insn (gen_rbit<mode>2 (operands[0], operands[1]));
-    emit_insn (gen_clz<mode>2 (operands[0], operands[0]));
-    DONE;
-  }
-)
+  "#"
+  "reload_completed"
+  [(const_int 0)]
+  "
+  emit_insn (gen_rbit<mode>2 (operands[0], operands[1]));
+  emit_insn (gen_clz<mode>2 (operands[0], operands[0]));
+  DONE;
+")
 
 (define_insn "*and<mode>_compare0"
   [(set (reg:CC_NZ CC_REGNUM)