On GCC-trunk/Cygwin/Core2 I observe the following behaviour. Compiling with:

g++ -std=gnu++0x -O2 -m32 -march=native -msse -msse2 -msse3 -Wall -Werror -Wno-unused -Wno-strict-aliasing -march=native -fomit-frame-pointer -Wno-pmf-conversions -g main.cpp

the following code:

-----------8<--------------
#include <x86intrin.h>

int test1(__m128i v) {
    return _mm_cvtsi128_si32(v);
}
-----------8<--------------

emits:

004012e0 <__Z5test1U8__vectorx>:
  4012e0:  83 ec 0c                 sub    $0xc,%esp
  4012e3:  66 0f 7e c0              movd   %xmm0,%eax
  4012e7:  83 c4 0c                 add    $0xc,%esp
  4012ea:  c3                       ret

which shows that the stack pointer is adjusted and restored for no purpose: nothing is ever stored in the 12 bytes reserved.

GCC also happens to lose track of the condition codes, as shown here:

  4011a0:  66 0f df 01              pandn  (%ecx),%xmm0
  4011a4:  39 d9                    cmp    %ebx,%ecx
  4011a6:  66 0f 7f 0c 24           movdqa %xmm1,(%esp)
  4011ab:  75 04                    jne    4011b1 <__Z8popcountPKU8__vectorxjj+0x61>
  4011ad:  66 0f db c1              pand   %xmm1,%xmm0
  4011b1:  66 0f 6f 1d 90 28 40     movdqa 0x402890,%xmm3
  4011b8:  00
  4011b9:  66 0f 6f 15 a0 28 40     movdqa 0x4028a0,%xmm2
  4011c0:  00
  4011c1:  66 0f 6f f3              movdqa %xmm3,%xmm6
  4011c5:  66 0f 6f fb              movdqa %xmm3,%xmm7
  4011c9:  66 0f db f0              pand   %xmm0,%xmm6
  4011cd:  66 0f df f8              pandn  %xmm0,%xmm7
  4011d1:  66 0f 6f ca              movdqa %xmm2,%xmm1
  4011d5:  66 0f 6f c7              movdqa %xmm7,%xmm0
  4011d9:  66 0f 38 00 ce           pshufb %xmm6,%xmm1
  4011de:  66 0f 71 d0 04           psrlw  $0x4,%xmm0
  4011e3:  66 0f 6f f1              movdqa %xmm1,%xmm6
  4011e7:  66 0f 6f fa              movdqa %xmm2,%xmm7
  4011eb:  39 d9                    cmp    %ebx,%ecx
  4011ed:  66 0f 38 00 f8           pshufb %xmm0,%xmm7
  4011f2:  66 0f fc f7              paddb  %xmm7,%xmm6
  4011f6:  66 0f ef ff              pxor   %xmm7,%xmm7
  4011fa:  66 0f f6 f7              psadbw %xmm7,%xmm6
  4011fe:  0f 84 be 00 00 00        je     4012c2 <__Z8popcountPKU8__vectorxjj+0x172>

The second cmp (at 4011eb) is superfluous: it repeats the comparison of %ebx and %ecx already made at 4011a4, and the SSE instructions in between modify neither the condition codes nor those two registers, so the flags set by the first cmp are still valid at the final je.

Are these known issues?

Best regards
Piotr Wyderski