On Wed, Jun 7, 2023 at 8:32 AM Uros Bizjak <ubiz...@gmail.com> wrote: > > On Wed, Jun 7, 2023 at 1:05 AM Roger Sayle <ro...@nextmovesoftware.com> wrote: > > > > > > This patch addresses the last remaining issue with PR target/31985, that > > GCC could make better use of memory addressing modes when implementing > > double word addition. This is achieved by adding a define_insn_and_split > > that combines an *add<dwi>3_doubleword with a *concat<mode><dwi>3, so > > that the components of the concat can be used directly, without first > > being loaded into a double word register. > > > > For test_c in the bugzilla PR: > > > > Before: > > pushl %ebx > > subl $16, %esp > > movl 28(%esp), %eax > > movl 36(%esp), %ecx > > movl 32(%esp), %ebx > > movl 24(%esp), %edx > > addl %ecx, %eax > > adcl %ebx, %edx > > movl %eax, 8(%esp) > > movl %edx, 12(%esp) > > addl $16, %esp > > popl %ebx > > ret > > > > After: > > test_c: > > subl $20, %esp > > movl 36(%esp), %eax > > movl 32(%esp), %edx > > addl 28(%esp), %eax > > adcl 24(%esp), %edx > > movl %eax, 8(%esp) > > movl %edx, 12(%esp) > > addl $20, %esp > > ret > > > > > > If this approach is considered acceptable, similar splitters can be > > used for other doubleword operations. > > Yes, the approach looks good to me. But please take care not to > interfere with STV transformations.
Hm, I have one second thought: +(define_insn_and_split "*add<dwi>3_doubleword_concat" + [(set (match_operand:<DWI> 0 "register_operand" "=r") + (plus:<DWI> + (any_or_plus:<DWI> + (ashift:<DWI> + (zero_extend:<DWI> + (match_operand:DWIH 2 "nonimmediate_operand" "rm")) + (match_operand:<DWI> 3 "const_int_operand")) + (zero_extend:<DWI> + (match_operand:DWIH 4 "nonimmediate_operand" "rm"))) + (match_operand:<DWI> 1 "register_operand" "0"))) + (clobber (reg:CC FLAGS_REG))] + "INTVAL (operands[3]) == <MODE_SIZE> * BITS_PER_UNIT" + "#" + "&& reload_completed" + [(parallel [(set (reg:CCC FLAGS_REG) + (compare:CCC + (plus:DWIH (match_dup 1) (match_dup 4)) + (match_dup 1))) + (set (match_dup 0) + (plus:DWIH (match_dup 1) (match_dup 4)))]) + (parallel [(set (match_dup 5) + (plus:DWIH + (plus:DWIH + (ltu:DWIH (reg:CC FLAGS_REG) (const_int 0)) + (match_dup 6)) + (match_dup 2))) + (clobber (reg:CC FLAGS_REG))])] + "split_double_mode (<DWI>mode, &operands[0], 2, &operands[0], &operands[5]);") The register allocator considers the instruction-to-be-split as one instruction, so it can allocate output register to match an input register (or a register that forms an input address), So, you have to either add an early clobber to the output, or somehow prevent output to clobber registers in the second pattern. Uros. > > > This patch has been tested on x86_64-pc-linux-gnu with make bootstrap > > and make -k check, both with and without --target_board=unix{-m32} > > with no new failures. Ok for mainline? > > > > 2023-06-07 Roger Sayle <ro...@nextmovesoftware.com> > > > > gcc/ChangeLog > > PR target/31985 > > * config/i386/i386.md (*add<dwi>3_doubleword_concat): New > > define_insn_and_split combine *add<dwi>3_doubleword with a > > *concat<mode><dwi>3 for more efficient lowering after reload. > > > > gcc/testsuite/ChangeLog > > PR target/31985 > > * gcc.target/i386/pr31985.c: New test case. > > OK. > > Thanks, > Uros. > > > > > > > Roger > > -- > >