On 5/20/19 3:51 PM, Uros Bizjak wrote:
> On Mon, May 20, 2019 at 11:39 PM Jeff Law <l...@redhat.com> wrote:
>>
>> On 5/20/19 3:36 PM, H.J. Lu wrote:
>>> With SSE emulation of MMX intrinsics in 64-bit mode,
>>>
>>> ---
>>> __v8qi test ()
>>> {
>>>   __v8qi mm0 = {1,2,3,4,5,6,7,8};
>>>   __v8qi mm1 = {11,22,33,44,55,66,77,88};
>>>   volatile __m64 x;
>>>
>>>   x = _mm_add_pi8 (mm0, mm1);
>>>
>>>   return x;
>>> }
>>> ---
>>>
>>> is compiled into
>>>
>>> 	movq	.LC0(%rip), %xmm0
>>> 	movq	.LC1(%rip), %xmm1
>>> 	paddb	%xmm1, %xmm0
>>> 	movq	%xmm0, -8(%rsp)
>>> 	movq	-8(%rsp), %xmm0
>>> 	ret
>>>
>>> instead of
>>>
>>> 	movq	.LC1(%rip), %mm0
>>> 	paddb	.LC0(%rip), %mm0
>>> 	movq	%mm0, -8(%rsp)
>>> 	movq	-8(%rsp), %xmm0
>>> 	ret
>>>
>>> Adjust gcc.target/i386/pr22076.c for 64-bit.
>>>
>>> 	* gcc.target/i386/pr22076.c: Adjusted for 64-bit.
>>
>> Well, it looks like you're just papering over a code quality regression?
>> Or am I missing something?
>
> We have to load 64bit value from memory to 128 bit XMM register using movq.
>
> OTOH, we could just use -mno-sse2 which disables XMM emulation.

Ah, we can't have a MEM operand on paddb for xmm registers...
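
Roughly, in intrinsics terms (just a sketch of the constraint, not the
compiler's actual expansion; _mm_loadl_epi64/_mm_add_epi8 are the SSE2
counterparts of the movq/paddb in the dump above):

#include <emmintrin.h>  /* SSE2 */

/* The xmm form of paddb only accepts a 128-bit memory operand, so a
   64-bit load can't be folded into it the way mmx paddb folds a
   64-bit MEM.  Each __m64-sized operand has to be pulled into an XMM
   register with movq first.  */
__m128i
add_pi8_via_sse2 (const void *a, const void *b)
{
  __m128i xa = _mm_loadl_epi64 ((const __m128i *) a); /* movq mem, %xmm */
  __m128i xb = _mm_loadl_epi64 ((const __m128i *) b); /* movq mem, %xmm */
  return _mm_add_epi8 (xa, xb);                       /* paddb %xmm, %xmm */
}

So the two extra movq loads in the emulated sequence look unavoidable
unless we fall back to the MMX registers.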
Jeff