On Sun, Nov 28, 2021 at 3:02 PM Roger Sayle <ro...@nextmovesoftware.com> wrote:
>
>
> This patch builds on the recent improvements to TImode rotations (and
> Jakub's fixes to shldq/shrdq patterns).  Now that expanding a TImode
> rotation can never fail, it is safe to allow general_operand constraints
> on the QImode shift amounts in rotlv1ti3 and rotrv1ti3 patterns.
> I've also made an additional tweak to ix86_expand_v1ti_to_ti to use
> vec_extract via V2DImode, which avoids using memory and takes advantage
> of vpextrq on recent hardware.
>
> For the following test case:
>
> typedef unsigned __int128 uv1ti __attribute__ ((__vector_size__ (16)));
> uv1ti rotr(uv1ti x, unsigned int i) { return (x >> i) | (x << (128-i)); }
>
> GCC with -O2 -mavx2 would previously generate:
>
> rotr:   vmovdqa %xmm0, -24(%rsp)
>         movq    -16(%rsp), %rdx
>         movl    %edi, %ecx
>         xorl    %esi, %esi
>         movq    -24(%rsp), %rax
>         shrdq   %rdx, %rax
>         shrq    %cl, %rdx
>         testb   $64, %dil
>         cmovne  %rdx, %rax
>         cmovne  %rsi, %rdx
>         negl    %ecx
>         xorl    %edi, %edi
>         andl    $127, %ecx
>         vmovq   %rax, %xmm2
>         movq    -24(%rsp), %rax
>         vpinsrq $1, %rdx, %xmm2, %xmm1
>         movq    -16(%rsp), %rdx
>         shldq   %rax, %rdx
>         salq    %cl, %rax
>         testb   $64, %cl
>         cmovne  %rax, %rdx
>         cmovne  %rdi, %rax
>         vmovq   %rax, %xmm3
>         vpinsrq $1, %rdx, %xmm3, %xmm0
>         vpor    %xmm1, %xmm0, %xmm0
>         ret
>
> with this patch, we now generate:
>
> rotr:   movl    %edi, %ecx
>         vpextrq $1, %xmm0, %rax
>         vmovq   %xmm0, %rdx
>         shrdq   %rax, %rdx
>         vmovq   %xmm0, %rsi
>         shrdq   %rsi, %rax
>         andl    $64, %ecx
>         movq    %rdx, %rsi
>         cmovne  %rax, %rsi
>         cmove   %rax, %rdx
>         vmovq   %rsi, %xmm0
>         vpinsrq $1, %rdx, %xmm0, %xmm0
>         ret
>
>
> This patch has been tested on x86_64-pc-linux-gnu with make bootstrap
> and make -k check with no new failures.  Ok for mainline?
>
>
> 2021-11-28  Roger Sayle  <ro...@nextmovesoftware.com>
>
> gcc/ChangeLog
>         * config/i386/i386-expand.c (ix86_expand_v1ti_to_ti): Perform the
>         conversion via V2DImode using vec_extractv2didi on TARGET_SSE2.
>         * config/i386/sse.md (rotlv1ti3, rotrv1ti3): Change constraint
>         on QImode shift amounts from const_int_operand to general_operand.
>
> gcc/testsuite/ChangeLog
>         * gcc.target/i386/sse2-v1ti-rotate.c: New test case.

OK.

Thanks,
Uros.

>
>
> Thanks in advance,
> Roger
> --
>
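
For anyone who wants to sanity-check the variable-count rotate outside the
testsuite, here is a minimal sketch. It is not the contents of
sse2-v1ti-rotate.c. The rotr function is the one from the quoted test case;
rotr_ref and count_rotr_mismatches are hypothetical helper names introduced
only for this illustration. It assumes GCC (or a compatible compiler) on a
64-bit target with __int128 and the vector extensions available, and it only
tries counts 1..127, since a shift by the full 128 bits is undefined.

```c
typedef unsigned __int128 u128;
typedef unsigned __int128 uv1ti __attribute__ ((__vector_size__ (16)));

/* Rotate right on a one-element vector, as in the quoted test case.  */
uv1ti rotr(uv1ti x, unsigned int i) { return (x >> i) | (x << (128-i)); }

/* Hypothetical scalar reference; only meaningful for 0 < i < 128.  */
static u128 rotr_ref(u128 x, unsigned int i)
{
  return (x >> i) | (x << (128 - i));
}

/* Hypothetical helper: compare the vector rotate against the scalar
   reference for every count from 1 to 127 and return the number of
   counts for which the two results differ.  */
int count_rotr_mismatches(void)
{
  const u128 v = ((u128) 0x0123456789abcdefULL << 64) | 0xfedcba9876543210ULL;
  int mismatches = 0;

  for (unsigned int i = 1; i < 128; i++)
    {
      uv1ti x = { v };          /* wrap the scalar in a V1TI vector */
      uv1ti r = rotr(x, i);     /* rotate via the vector type */
      if (r[0] != rotr_ref(v, i))
        mismatches++;
    }
  return mismatches;
}
```

Building this with -O2 -mavx2 and disassembling rotr should make it possible
to compare against the sequences quoted above, though the exact output will
depend on the compiler version.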