Currently, make_extraction() identifies where we can emit an ASHIFT of an extend in place of an extraction, but fails to make the corresponding canonicalization/simplification when presented with a MULT by a power of two. Such a representation is canonical when representing a left-shifted address inside a MEM.
This patch remedies this situation: after the patch, make_extraction() now also identifies RTXs such as: (mult:DI (subreg:DI (reg:SI r)) (const_int 2^n)) and rewrites this as: (mult:DI (sign_extend:DI (reg:SI r)) (const_int 2^n)) instead of using a sign_extract. (This patch also fixes up a comment in expand_compound_operation() which appears to have suffered from bitrot.) This fixes PR96998: an ICE on AArch64 due to an unrecognised sign_extract insn which was exposed by r11-2903-g6b3034eaba83935d9f6dfb20d2efbdb34b5b00bf. That change introduced a canonicalisation in LRA to rewrite mult to shift in address reloads. Prior to this patch, the flow was as follows. We start with the following insn going into combine: (insn 9 8 10 3 (set (mem:SI (plus:DI (mult:DI (reg:DI 98 [ g ]) (const_int 4 [0x4])) (reg/f:DI 96)) [3 *i_5+0 S4 A32]) (asm_operands:SI ("") ("=Q") 0 [] [] [] test.c:11)) "test.c":11:5 -1 (expr_list:REG_DEAD (reg:DI 98 [ g ]) (nil))) Then combine turns this into a sign_extract: (insn 9 8 10 3 (set (mem:SI (plus:DI (sign_extract:DI (mult:DI (subreg:DI (reg/v:SI 92 [ g ]) 0) (const_int 4 [0x4])) (const_int 34 [0x22]) (const_int 0 [0])) (reg/f:DI 96)) [3 *i_5+0 S4 A32]) (asm_operands:SI ("") ("=Q") 0 [] [] [] test.c:11)) "test.c":11:5 -1 (expr_list:REG_DEAD (reg/v:SI 92 [ g ]) (nil))) Then LRA reloads the address and (prior to the LRA change) we get: (insn 32 8 9 3 (set (reg:DI 0 x0 [100]) (plus:DI (sign_extract:DI (mult:DI (reg:DI 0 x0 [orig:92 g ] [92]) (const_int 4 [0x4])) (const_int 34 [0x22]) (const_int 0 [0])) (reg/f:DI 19 x19 [96]))) "test.c":11:5 283 {*add_extvdi_multp2} (nil)) (insn 9 32 10 3 (set (mem:SI (reg:DI 0 x0 [100]) [3 *i_5+0 S4 A32]) (asm_operands:SI ("") ("=Q") 0 [] [] [] test.c:11)) "test.c":11:5 -1 (nil)) Now observe that insn 32 here is not canonical: firstly, we should be using an ASHIFT by 2 instead of a MULT by 4, since we're outside of a MEM. Indeed, the LRA change remedies this, and support for such insns in the AArch64 backend was dropped in r11-3033-g2f8ae301f6a125f50b0a758047fcddae7b68daa8. Now the reason we ICE after the LRA change here is that AArch64 has never supported the ASHIFT variant of this sign_extract insn. Inspecting the unrecognised reloaded insn confirms this: (gdb) p debug(insn) (insn 33 8 34 3 (set (reg:DI 100) (sign_extract:DI (ashift:DI (subreg:DI (reg/v:SI 92 [ g ]) 0) (const_int 2 [0x2])) (const_int 34 [0x22]) (const_int 0 [0]))) "test.c":11:5 -1 (nil)) The thesis of this patch is that combine should _never_ be producing such an insn. Clearly this should be canonicalised as an extend operation instead (as combine already does in make_extraction() for the ASHIFT form). After this change to combine, we get: (insn 9 8 10 3 (set (mem:SI (plus:DI (mult:DI (sign_extend:DI (reg/v:SI 92 [ g ])) (const_int 4 [0x4])) (reg/f:DI 96)) [3 *i_5+0 S4 A32]) (asm_operands:SI ("") ("=Q") 0 [] [] [] test.c:11)) "test.c":11:5 -1 (expr_list:REG_DEAD (reg/v:SI 92 [ g ]) (nil))) coming out of combine, and LRA can happily reload the address: (insn 32 8 9 3 (set (reg:DI 0 x0 [100]) (plus:DI (ashift:DI (sign_extend:DI (reg/v:SI 0 x0 [orig:92 g ] [92])) (const_int 2 [0x2])) (reg/f:DI 19 x19 [96]))) "test.c":11:5 245 {*add_extendsi_shft_di} (nil)) (insn 9 32 10 3 (set (mem:SI (reg:DI 0 x0 [100]) [3 *i_5+0 S4 A32]) (asm_operands:SI ("") ("=Q") 0 [] [] [] test.c:11)) "test.c":11:5 -1 (nil)) and all is well, with nice simple and canonical RTL being used throughout. Testing: * Bootstrap and regtest on aarch64-linux-gnu, arm-linux-gnueabihf, and x86-linux-gnu in progress. OK for trunk (with AArch64 changes discussed here [0] as a follow-on patch) provided it passes testing? Thanks, Alex [0] : https://gcc.gnu.org/pipermail/gcc-patches/2020-September/554257.html --- gcc/ChangeLog: PR target/96998 * combine.c (expand_compound_operation): Tweak variable name in comment to match source. (make_extraction): Handle mult by power of two in addition to ashift. gcc/testsuite/ChangeLog: PR target/96998 * gcc.c-torture/compile/pr96998.c: New test.
diff --git a/gcc/combine.c b/gcc/combine.c index c88382efbd3..fe8eff2b464 100644 --- a/gcc/combine.c +++ b/gcc/combine.c @@ -7419,8 +7419,8 @@ expand_compound_operation (rtx x) } /* If we reach here, we want to return a pair of shifts. The inner - shift is a left shift of BITSIZE - POS - LEN bits. The outer - shift is a right shift of BITSIZE - LEN bits. It is arithmetic or + shift is a left shift of MODEWIDTH - POS - LEN bits. The outer + shift is a right shift of MODEWIDTH - LEN bits. It is arithmetic or logical depending on the value of UNSIGNEDP. If this was a ZERO_EXTEND or ZERO_EXTRACT, this pair of shifts will be @@ -7650,20 +7650,27 @@ make_extraction (machine_mode mode, rtx inner, HOST_WIDE_INT pos, is_mode = GET_MODE (SUBREG_REG (inner)); inner = SUBREG_REG (inner); } - else if (GET_CODE (inner) == ASHIFT + else if ((GET_CODE (inner) == ASHIFT || GET_CODE (inner) == MULT) && CONST_INT_P (XEXP (inner, 1)) - && pos_rtx == 0 && pos == 0 - && len > UINTVAL (XEXP (inner, 1))) - { - /* We're extracting the least significant bits of an rtx - (ashift X (const_int C)), where LEN > C. Extract the - least significant (LEN - C) bits of X, giving an rtx - whose mode is MODE, then shift it left C times. */ - new_rtx = make_extraction (mode, XEXP (inner, 0), - 0, 0, len - INTVAL (XEXP (inner, 1)), - unsignedp, in_dest, in_compare); - if (new_rtx != 0) - return gen_rtx_ASHIFT (mode, new_rtx, XEXP (inner, 1)); + && pos_rtx == 0 && pos == 0) + { + const HOST_WIDE_INT ci = INTVAL (XEXP (inner, 1)); + const auto code = GET_CODE (inner); + const HOST_WIDE_INT shift_amt = (code == MULT) ? exact_log2 (ci) : ci; + + if (shift_amt > 0 && len > (unsigned HOST_WIDE_INT)shift_amt) + { + /* We're extracting the least significant bits of an rtx + (ashift X (const_int C)) or (mult X (const_int (2^C))), + where LEN > C. Extract the least significant (LEN - C) bits + of X, giving an rtx whose mode is MODE, then shift it left + C times. */ + new_rtx = make_extraction (mode, XEXP (inner, 0), + 0, 0, len - shift_amt, + unsignedp, in_dest, in_compare); + if (new_rtx) + return gen_rtx_fmt_ee (code, mode, new_rtx, XEXP (inner, 1)); + } } else if (GET_CODE (inner) == TRUNCATE /* If trying or potentionally trying to extract diff --git a/gcc/testsuite/gcc.c-torture/compile/pr96998.c b/gcc/testsuite/gcc.c-torture/compile/pr96998.c new file mode 100644 index 00000000000..a75d5dcfe08 --- /dev/null +++ b/gcc/testsuite/gcc.c-torture/compile/pr96998.c @@ -0,0 +1,15 @@ +/* { dg-do compile { target arm*-*-* aarch64*-*-* } } */ + +int h(void); +struct c d; +struct c { + int e[1]; +}; + +void f(void) { + int g; + for (;; g = h()) { + int *i = &d.e[g]; + asm("" : "=Q"(*i)); + } +}