On 32-bit x86 I compiled this test case with -msse2 -mmmx:

#include <mmintrin.h>

__m64 foo (int *p)
{
  __m64 m1 = _mm_cvtsi32_si64 (*p++);
  __m64 m2 = _mm_cvtsi32_si64 (*p);
  return _mm_unpacklo_pi8 (m1, m2);
}

Note that all of the operations are MMX intrinsics declared in mmintrin.h. The
resulting assembler code includes the following:

        movd    (%eax), %xmm0
        movq    %xmm0, -8(%ebp)
        movd    4(%eax), %xmm0
        movq    -8(%ebp), %mm0
        movq    %xmm0, -16(%ebp)
        punpcklbw       -16(%ebp), %mm0

If I take off -msse2, I get this:

        movd    (%eax), %mm0
        movd    4(%eax), %mm1
        punpcklbw       %mm1, %mm0

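That is not optimal either: the 4(%eax) operand could appear directly as the
memory operand of the punpcklbw instruction. But it is much better than the
first result, which moves both values through SSE registers and then spills
them to the stack before the MMX instruction can use them. The mere presence
of -msse2 should not cause SSE registers to be used when only MMX operations
are requested.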
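
For reference, the value the test case is expected to compute can be modelled
in portable scalar C. This is only a sketch for checking the result of the
generated code, not a proposed fix. The names unpacklo_pi8_model, foo_model
and check_example are hypothetical and are not part of mmintrin.h. The sketch
assumes that _mm_cvtsi32_si64 zero-extends its 32-bit argument (as movd does)
and that punpcklbw interleaves the low four bytes of its two operands, with
the destination operand's byte first in each pair.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical scalar model of _mm_unpacklo_pi8 (a, b): interleave the low
   four bytes of a and b, taking a's byte first in each 16-bit pair.  */
uint64_t unpacklo_pi8_model (uint64_t a, uint64_t b)
{
  uint64_t r = 0;
  for (int i = 0; i < 4; i++)
    {
      uint64_t from_a = (a >> (8 * i)) & 0xff;
      uint64_t from_b = (b >> (8 * i)) & 0xff;
      r |= from_a << (16 * i);
      r |= from_b << (16 * i + 8);
    }
  return r;
}

/* Hypothetical scalar model of foo: load two adjacent ints, zero-extend
   each to 64 bits (assumed behaviour of _mm_cvtsi32_si64), then interleave.  */
uint64_t foo_model (const int *p)
{
  uint64_t m1 = (uint32_t) p[0];
  uint64_t m2 = (uint32_t) p[1];
  return unpacklo_pi8_model (m1, m2);
}

/* Worked example: bytes 01 02 03 04 interleaved with 0A 0B 0C 0D.  */
void check_example (void)
{
  int v[2] = { 0x04030201, 0x0D0C0B0A };
  assert (foo_model (v) == 0x0D040C030B020A01ULL);
}
```

Both assembler sequences above should produce this same value in %mm0; the
complaint is only about the extra moves through %xmm0 and the stack.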
--
Summary: Dumb use of SSE regs for MMX operation
Product: gcc
Version: 4.6.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: target
AssignedTo: unassigned at gcc dot gnu dot org
ReportedBy: ian at airs dot com
GCC target triplet: i686-pc-linux-gnu
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=43743