The recent change in TImode parameter passing on x86_64 causes pr91681-1.c to FAIL. The issue is that, with the extra flexibility, the combine pass is now spoilt for choice between the *add<dwi>3_doubleword_concat and *add<dwi>3_doubleword_zext patterns when one operand is a *concat and the other is a zero_extend. The solution proposed below is to provide an *add<dwi>3_doubleword_concat_zext define_insn_and_split that can both benefit from the register allocation of *concat and still avoid the xor normally required by zero extension.
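For illustration only, the arithmetic that the post-reload split of the new pattern performs on a 64-bit target can be modelled in plain C as below. This is a sketch and not code from the patch or from pr91681-1.c; the names u128_parts and add_concat_zext are hypothetical, and the comments map each step onto the split sequence on the assumption that operand 0 is the low half and operand 5 the high half of the destination.

```c
#include <stdint.h>

/* Hypothetical stand-in for a double-word value held in two word registers. */
typedef struct { uint64_t lo, hi; } u128_parts;

/* Computes ((c_hi << 64) | c_lo) + (zero-extended x), word by word.  */
static u128_parts add_concat_zext(uint64_t c_hi, uint64_t c_lo, uint64_t x)
{
    u128_parts r;
    unsigned carry;

    r.lo = c_lo;            /* (set (match_dup 0) (match_dup 4)) */
    r.hi = c_hi;            /* (set (match_dup 5) (match_dup 2)) */
    r.lo += x;              /* low-word add, which sets the carry flag */
    carry = r.lo < x;       /* unsigned wrap-around means a carry out */
    r.hi += carry;          /* add-with-carry of zero into the high word */
    return r;
}
```

The high half of the zero-extended operand is known to be zero, so the model never materialises it; this corresponds to the xor that the pattern avoids.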
I'm investigating a follow-up refinement to improve register allocation further by avoiding the early clobber in the =&r, and handling (custom) reloads explicitly, but this piece resolves the testcase failure.

This patch has been tested on x86_64-pc-linux-gnu with make bootstrap and make -k check, both with and without --target_board=unix{-m32} with no new failures. Ok for mainline?


2023-07-11  Roger Sayle  <ro...@nextmovesoftware.com>

gcc/ChangeLog
        PR target/91681
        * config/i386/i386.md (*add<dwi>3_doubleword_concat_zext): New
        define_insn_and_split derived from *add<dwi>3_doubleword_concat
        and *add<dwi>3_doubleword_zext.


Thanks,
Roger
--

diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md
index e47ced1..ca6977f 100644
--- a/gcc/config/i386/i386.md
+++ b/gcc/config/i386/i386.md
@@ -6222,6 +6222,39 @@
               (clobber (reg:CC FLAGS_REG))])]
   "split_double_mode (<DWI>mode, &operands[0], 2, &operands[0], &operands[5]);")
 
+(define_insn_and_split "*add<dwi>3_doubleword_concat_zext"
+  [(set (match_operand:<DWI> 0 "register_operand" "=&r")
+        (plus:<DWI>
+          (any_or_plus:<DWI>
+            (ashift:<DWI>
+              (zero_extend:<DWI>
+                (match_operand:DWIH 2 "nonimmediate_operand" "rm"))
+              (match_operand:QI 3 "const_int_operand"))
+            (zero_extend:<DWI>
+              (match_operand:DWIH 4 "nonimmediate_operand" "rm")))
+          (zero_extend:<DWI>
+            (match_operand:DWIH 1 "nonimmediate_operand" "rm"))))
+   (clobber (reg:CC FLAGS_REG))]
+  "INTVAL (operands[3]) == <MODE_SIZE> * BITS_PER_UNIT"
+  "#"
+  "&& reload_completed"
+  [(set (match_dup 0) (match_dup 4))
+   (set (match_dup 5) (match_dup 2))
+   (parallel [(set (reg:CCC FLAGS_REG)
+                   (compare:CCC
+                     (plus:DWIH (match_dup 0) (match_dup 1))
+                     (match_dup 0)))
+              (set (match_dup 0)
+                   (plus:DWIH (match_dup 0) (match_dup 1)))])
+   (parallel [(set (match_dup 5)
+                   (plus:DWIH
+                     (plus:DWIH
+                       (ltu:DWIH (reg:CC FLAGS_REG) (const_int 0))
+                       (match_dup 5))
+                     (const_int 0)))
+              (clobber (reg:CC FLAGS_REG))])]
+  "split_double_mode (<DWI>mode, &operands[0], 1, &operands[0], &operands[5]);")
+
 (define_insn "*add<mode>_1"
   [(set (match_operand:SWI48 0 "nonimmediate_operand" "=rm,r,r,r")
         (plus:SWI48