When I compile this code:
#include <mmintrin.h>
__m64 moo(int i) {
__m64 tmp = _mm_cvtsi32_si64(i);
return tmp;
}
With (GCC) 4.0.0 20050116 like so:
gcc -O3 -S -mmmx moo.c
I get this (function prologue and epilogue omitted):
movd 12(%ebp), %mm0
movq %mm0, (%eax)
However, if I use the -msse flag instead of -mmmx, I get this:
movd 12(%ebp), %mm0
movq %mm0, -8(%ebp)
movlps -8(%ebp), %xmm1
movlps %xmm1, (%eax)
gcc 3.4.2 does not display this behavior. I haven't had the chance to test it on
my Linux installation yet, but I'm pretty sure it will give the same
results. I didn't use any special flags when configuring or building gcc (just
../gcc-4.0-20050116/configure --enable-languages=c,c++ and make bootstrap).
With -O0 instead of -O3, it looks like gcc replaced some movq's with
movlps's (why?), and they do not get cancelled out during optimization.
I will attach the .i file generated by "gcc -O3 -S -msse moo.c".
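To confirm that only the instruction selection differs and not the result, a small round-trip check can be built with either flag. This is a minimal sketch: moo_roundtrip is a hypothetical helper name that is not part of the attached test case, and I am assuming that _mm_cvtsi64_si32 and _mm_empty from mmintrin.h are available in the compiler under test.

```c
#include <mmintrin.h>

/* Function from the report: put a 32-bit int into the low half of an __m64. */
__m64 moo(int i) {
    __m64 tmp = _mm_cvtsi32_si64(i);
    return tmp;
}

/* Hypothetical helper: read the low 32 bits back out of the __m64,
   then clear the MMX state so later x87 code is not affected. */
int moo_roundtrip(int i) {
    int low = _mm_cvtsi64_si32(moo(i));
    _mm_empty();
    return low;
}
```

The value returned by moo_roundtrip should equal its argument whether the file is built with -mmmx or -msse; only the generated moves should change.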
I also tried a "direct conversion":
__m64 tmp = (__m64) (long long) i;
But I get a compiler error:
internal compiler error: in convert_move, at expr.c:367
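A possible way to express the same conversion without the vector cast is to copy the bytes with memcpy. This is only a sketch: to_m64 and to_m64_bits are hypothetical helper names, it assumes that __m64 and long long are both 8 bytes, and I have not checked whether it avoids the ICE on the 4.0.0 snapshot. Note that, like the cast, it sign-extends i, whereas _mm_cvtsi32_si64 zeroes the upper 32 bits.

```c
#include <mmintrin.h>
#include <string.h>

/* Hypothetical helper: build an __m64 from an int without a vector cast. */
__m64 to_m64(int i) {
    long long wide = i;              /* sign-extends i to 64 bits */
    __m64 out;
    memcpy(&out, &wide, sizeof out); /* assumes sizeof(__m64) == sizeof(long long) */
    return out;
}

/* Hypothetical helper: copy the 64 bits back out for inspection. */
long long to_m64_bits(int i) {
    __m64 v = to_m64(i);
    long long back;
    memcpy(&back, &v, sizeof back);
    _mm_empty();                     /* clear MMX state in case v lived in an MMX register */
    return back;
}
```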
--
Summary: MMX load intrinsic produces superfluous SSE instructions
(movlps)
Product: gcc
Version: 4.0.0
Status: UNCONFIRMED
Severity: normal
Priority: P2
Component: regression
AssignedTo: unassigned at gcc dot gnu dot org
ReportedBy: guardia at sympatico dot ca
CC: gcc-bugs at gcc dot gnu dot org
GCC build triplet: i686-pc-mingw32
GCC host triplet: i686-pc-mingw32
GCC target triplet: i686-pc-mingw32
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=19530