A customer reported the following missed opportunities to combine a couple of instructions into an sbfiz.
int sbfiz32 (int x) { return x << 29 >> 10; }
long sbfiz64 (long x) { return x << 58 >> 20; }

This gets converted to the following pattern:

(set (reg:SI 98)
     (ashift:SI (sign_extend:SI (reg:HI 0 x0 [ xD.3334 ]))
                (const_int 6 [0x6])))

Currently, gcc generates the following:

sbfiz32:
        lsl     w0, w0, 29
        asr     w0, w0, 10
        ret

sbfiz64:
        lsl     x0, x0, 58
        asr     x0, x0, 20
        ret

It could generate this instead:

sbfiz32:
        sbfiz   w0, w0, 19, 3
        ret

sbfiz64:
        sbfiz   x0, x0, 38, 6
        ret

The unsigned versions already generate ubfiz for the same code, so the lack of an sbfiz pattern may have been an oversight.

This particular sbfiz pattern shows up in both CPU2006 (~80 hits) and CPU2017 (~280 hits). That's not a lot, but the transformation still seems beneficial. There were no significant performance differences, probably due to the small number of occurrences or to the hits falling outside hot code.

Regression-tested and bootstrapped OK on aarch64-linux. Validated with CPU2017 and CPU2006 runs.

I thought I'd put this up for review now, even though I know we're not back in development mode yet.

2018-02-02  Luis Machado  <luis.mach...@linaro.org>

gcc/
        * config/aarch64/aarch64.md (*ashift<mode>_extv_bfiz): New pattern.

gcc/testsuite/
        * gcc.target/aarch64/lsl_asr_sbfiz.c: New test.
---
 gcc/config/aarch64/aarch64.md                    | 13 +++++++++++++
 gcc/testsuite/gcc.target/aarch64/lsl_asr_sbfiz.c | 24 ++++++++++++++++++++++++
 2 files changed, 37 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/aarch64/lsl_asr_sbfiz.c

diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
index 5a2a930..d336bf0 100644
--- a/gcc/config/aarch64/aarch64.md
+++ b/gcc/config/aarch64/aarch64.md
@@ -4828,6 +4828,19 @@
   [(set_attr "type" "bfx")]
 )
 
+;; Match sbfiz pattern in a shift left + shift right operation.
+
+(define_insn "*ashift<mode>_extv_bfiz"
+  [(set (match_operand:GPI 0 "register_operand" "=r")
+	(ashift:GPI (sign_extract:GPI (match_operand:GPI 1 "register_operand" "r")
+				      (match_operand 2 "const_int_operand" "n")
+				      (match_operand 3 "const_int_operand" "n"))
+		    (match_operand 4 "const_int_operand" "n")))]
+  "UINTVAL (operands[2]) < <GPI:sizen>"
+  "sbfiz\\t%<w>0, %<w>1, %4, %2"
+  [(set_attr "type" "bfx")]
+)
+
 ;; When the bit position and width of the equivalent extraction add up to 32
 ;; we can use a W-reg LSL instruction taking advantage of the implicit
 ;; zero-extension of the X-reg.
diff --git a/gcc/testsuite/gcc.target/aarch64/lsl_asr_sbfiz.c b/gcc/testsuite/gcc.target/aarch64/lsl_asr_sbfiz.c
new file mode 100644
index 0000000..931f8f4
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/lsl_asr_sbfiz.c
@@ -0,0 +1,24 @@
+/* { dg-do compile } */
+/* { dg-options "-O3" } */
+
+/* Check that a LSL followed by an ASR can be combined into a single SBFIZ
+   instruction.  */
+
+/* Using W-reg */
+
+int
+sbfiz32 (int x)
+{
+  return x << 29 >> 10;
+}
+
+/* Using X-reg */
+
+long
+sbfiz64 (long x)
+{
+  return x << 58 >> 20;
+}
+
+/* { dg-final { scan-assembler "sbfiz\tw" } } */
+/* { dg-final { scan-assembler "sbfiz\tx" } } */
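P.S.: As a sanity check of the operand mapping (not part of the patch), here is a small self-contained C sketch that models SBFIZ's lsb/width semantics and compares them against the shift pair. sbfiz_model32 is just an illustrative helper name, and the sketch assumes GCC's arithmetic-right-shift behavior on signed types:

#include <assert.h>
#include <stdint.h>

/* Illustrative model of AArch64 SBFIZ Wd, Wn, #lsb, #width:
   sign-extend the low WIDTH bits of X, then shift the result left
   by LSB.  Relies on GCC shifting signed values arithmetically
   (implementation-defined in ISO C).  */
static int32_t
sbfiz_model32 (int32_t x, unsigned lsb, unsigned width)
{
  int32_t field = (int32_t) ((uint32_t) x << (32 - width)) >> (32 - width);
  return (int32_t) ((uint32_t) field << lsb);
}

int
main (void)
{
  for (int32_t x = -4096; x <= 4096; x++)
    /* x << 29 >> 10 corresponds to lsb = 29 - 10 = 19 and
       width = 32 - 29 = 3.  */
    assert ((x << 29 >> 10) == sbfiz_model32 (x, 19, 3));
  return 0;
}

The same arithmetic gives lsb = 58 - 20 = 38 and width = 64 - 58 = 6 for the 64-bit case, matching the sbfiz operands shown above.

-- 
2.7.4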