On Sun, Jul 9, 2023 at 11:30 PM Roger Sayle <ro...@nextmovesoftware.com> wrote:
>
>
> This patch implements another of Uros' suggestions, to investigate an
> insvti_lowpart_1 pattern to improve TImode parameter passing on x86_64.
> In PR 88873, the RTL the middle-end expands for passing V2DF in TImode
> is subtly different from what it does for V2DI in TImode, sufficiently so
> that my explanations for why insvti_lowpart_1 isn't required don't apply
> in this case.
>
> This patch adds an insvti_lowpart_1 pattern, complementing the existing
> insvti_highpart_1 pattern, and also a 32-bit variant, insvdi_lowpart_1.
> Because the middle-end represents 128-bit constants using CONST_WIDE_INT
> and 64-bit constants using CONST_INT, it's easiest to treat these as
> different patterns, rather than attempt <dwi> parameterization.
>
> This patch also includes a peephole2 (actually a pair) to transform
> xchg instructions into mov instructions, when one of the destinations
> is unused.  This optimization is required to produce the optimal code
> sequences below.
>
> For the 64-bit case:
>
> __int128 foo(__int128 x, unsigned long long y)
> {
>   __int128 m = ~((__int128)~0ull);
>   __int128 t = x & m;
>   __int128 r = t | y;
>   return r;
> }
>
> Before:
>         xchgq   %rdi, %rsi
>         movq    %rdx, %rax
>         xorl    %esi, %esi
>         xorl    %edx, %edx
>         orq     %rsi, %rax
>         orq     %rdi, %rdx
>         ret
>
> After:
>         movq    %rdx, %rax
>         movq    %rsi, %rdx
>         ret
>
> For the 32-bit case:
>
> long long bar(long long x, int y)
> {
>   long long mask = ~0ull << 32;
>   long long t = x & mask;
>   long long r = t | (unsigned int)y;
>   return r;
> }
>
> Before:
>         pushl   %ebx
>         movl    12(%esp), %edx
>         xorl    %ebx, %ebx
>         xorl    %eax, %eax
>         movl    16(%esp), %ecx
>         orl     %ebx, %edx
>         popl    %ebx
>         orl     %ecx, %eax
>         ret
>
> After:
>         movl    12(%esp), %eax
>         movl    8(%esp), %edx
>         ret
>
> This patch has been tested on x86_64-pc-linux-gnu with make bootstrap
> and make -k check, both with and without --target_board=unix{-m32}
> with no new failures.  Ok for mainline?
>
>
> 2023-07-09 Roger Sayle <ro...@nextmovesoftware.com>
>
> gcc/ChangeLog
>         * config/i386/i386.md (peephole2): Transform xchg insn with a
>         REG_UNUSED note to a (simple) move.
>         (*insvti_lowpart_1): New define_insn_and_split.
>         (*insvdi_lowpart_1): Likewise.
>
> gcc/testsuite/ChangeLog
>         * gcc.target/i386/insvdi_lowpart-1.c: New test case.
>         * gcc.target/i386/insvti_lowpart-1.c: Likewise.

OK.

Thanks,
Uros.

>
> Cheers,
> Roger
> --
>
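
For readers following the thread, here is a minimal sketch of what the two test functions compute: each keeps the high half of x and replaces the low half with y, which is why two moves suffice in the "After" sequences. foo and bar are copied from the patch description; foo_ref, bar_ref and lowpart_examples_agree are hypothetical helpers added only for illustration and are not part of the patch or its test cases. The sketch assumes a compiler and target that provide __int128 (for example GCC on x86_64) and relies on GCC's modular behaviour for out-of-range conversions to signed types.

```c
#include <assert.h>
#include <stdint.h>

/* 64-bit case, as given in the patch description. */
__int128 foo(__int128 x, unsigned long long y)
{
  __int128 m = ~((__int128)~0ull);   /* high 64 bits set, low 64 clear */
  __int128 t = x & m;                /* keep only the high half of x */
  __int128 r = t | y;                /* y is zero-extended into the low half */
  return r;
}

/* Hypothetical reference: build the result from the two halves directly. */
__int128 foo_ref(__int128 x, unsigned long long y)
{
  unsigned __int128 hi = (unsigned __int128)x >> 64;
  return (__int128)((hi << 64) | y);
}

/* 32-bit case, as given in the patch description. */
long long bar(long long x, int y)
{
  long long mask = ~0ull << 32;
  long long t = x & mask;
  long long r = t | (unsigned int)y;
  return r;
}

/* Hypothetical reference for the 32-bit case. */
long long bar_ref(long long x, int y)
{
  uint64_t hi = (uint64_t)x >> 32;
  return (long long)((hi << 32) | (uint32_t)y);
}

/* Returns 1 if the original and reference functions agree on a few inputs. */
int lowpart_examples_agree(void)
{
  __int128 x = ((__int128)0x0123456789abcdefLL << 64) | 0xfedcba9876543210ULL;
  __int128 minus_one = -1;

  if (foo(x, 0x1111111111111111ULL) != foo_ref(x, 0x1111111111111111ULL))
    return 0;
  if (foo(minus_one, 5) != foo_ref(minus_one, 5))
    return 0;
  if (bar(0x1122334455667788LL, 0x0badf00d)
      != bar_ref(0x1122334455667788LL, 0x0badf00d))
    return 0;
  if (bar(-1LL, -1) != bar_ref(-1LL, -1))
    return 0;
  return 1;
}
```

Calling lowpart_examples_agree() from a small main, or under a debugger, is enough to check the equivalence; it says nothing about which instructions the compiler actually emits, which is what the new test cases in the patch check.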