On 32-bit x86 I compiled this test case with -msse2 -mmmx:

    #include <mmintrin.h>

    __m64
    foo (int *p)
    {
      __m64 m1 = _mm_cvtsi32_si64 (*p++);
      __m64 m2 = _mm_cvtsi32_si64 (*p);
      return _mm_unpacklo_pi8 (m1, m2);
    }

Note that all operations are MMX functions defined in mmintrin.h. The resulting assembler code includes the following:

    movd      (%eax), %xmm0
    movq      %xmm0, -8(%ebp)
    movd      4(%eax), %xmm0
    movq      -8(%ebp), %mm0
    movq      %xmm0, -16(%ebp)
    punpcklbw -16(%ebp), %mm0

Both values are loaded into an SSE register and then spilled to the stack before they reach the MMX unit. If I take off -msse2, I get this:

    movd      (%eax), %mm0
    movd      4(%eax), %mm1
    punpcklbw %mm1, %mm0

That is not optimal, since the 4(%eax) operand could appear directly in the punpcklbw instruction, but it is much better than the first result. The mere presence of -msse2 should not cause SSE registers to be used when only MMX operations are requested.

-- 
           Summary: Dumb use of SSE regs for MMX operation
           Product: gcc
           Version: 4.6.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: target
        AssignedTo: unassigned at gcc dot gnu dot org
        ReportedBy: ian at airs dot com
GCC target triplet: i686-pc-linux-gnu

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=43743
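
For anyone checking that both code sequences produce the same value, here is a rough scalar model of what foo computes. It is only a sketch and not part of the original test case: the name foo_model and the uint64_t return type are assumptions made for illustration, standing in for the __m64 result. It relies on _mm_cvtsi32_si64 zero-extending each 32-bit value and on punpcklbw interleaving the low four bytes of its two operands, starting with the first operand.

```c
#include <stdint.h>

/* Hypothetical helper, not from the report: portable model of foo().
   Interleaves the four bytes of p[0] with the four bytes of p[1],
   lowest byte first, as punpcklbw does for the low halves of two
   64-bit MMX operands. */
uint64_t
foo_model (const int *p)
{
  uint32_t m1 = (uint32_t) p[0];   /* models _mm_cvtsi32_si64 (*p++) */
  uint32_t m2 = (uint32_t) p[1];   /* models _mm_cvtsi32_si64 (*p)   */
  uint64_t r = 0;
  int i;

  for (i = 0; i < 4; i++)
    {
      /* Byte i of m1 goes to result byte 2*i, byte i of m2 to 2*i+1.  */
      r |= (uint64_t) ((m1 >> (8 * i)) & 0xff) << (16 * i);
      r |= (uint64_t) ((m2 >> (8 * i)) & 0xff) << (16 * i + 8);
    }
  return r;
}
```

Comparing the 64-bit result of foo against foo_model for a few inputs, once with and once without -msse2, is one way to confirm that the difference is purely a register allocation issue and not a change in the computed value.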