The following code should produce cvtps2pd and cvtpi2pd instructions, which
operate on vectors:
void test_fp (float *a, double *b)
{
  int i;

  for (i = 0; i < 4; i++)
    b[i] = (double) a[i];
}

void test_int (int *a, double *b)
{
  int i;

  for (i = 0; i < 4; i++)
    b[i] = (double) a[i];
}
Currently, gcc produces scalar instructions
(gcc -O2 -march=pentium4 -mfpmath=sse -ftree-vectorize):
.L2:
        movss   -4(%ecx,%eax,4), %xmm0
        cvtss2sd        %xmm0, %xmm0
        movsd   %xmm0, -8(%edx,%eax,8)
        addl    $1, %eax
        cmpl    $5, %eax
        jne     .L2
and
.L9:
        cvtsi2sd        -4(%ecx,%eax,4), %xmm0
        movsd   %xmm0, -8(%edx,%eax,8)
        addl    $1, %eax
        cmpl    $5, %eax
        jne     .L9
(BTW: The movss in the first example is redundant, since cvtss2sd can take
its source operand directly from memory.)
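
For illustration only, the kind of vector code wanted here can be sketched by
hand with SSE2 intrinsics. The function names test_fp_sse2 and test_int_sse2
are hypothetical and not part of the testcase. Note that the integer variant
uses _mm_cvtepi32_pd (cvtdq2pd), the XMM-register form of the conversion,
rather than the MMX-based cvtpi2pd named above; that substitution is an
assumption of this sketch, not a statement about what the vectorizer should
emit.

```c
#include <emmintrin.h>  /* SSE2 intrinsics */

/* Hypothetical hand-vectorized counterpart of test_fp:
   converts two floats per iteration with cvtps2pd.  */
void test_fp_sse2 (const float *a, double *b)
{
  int i;

  for (i = 0; i < 4; i += 2)
    {
      /* Load two floats (64 bits) into the low half of an XMM register.  */
      __m128 f = _mm_castsi128_ps (_mm_loadl_epi64 ((const __m128i *) (a + i)));
      /* cvtps2pd: convert the two low floats to two doubles.  */
      __m128d d = _mm_cvtps_pd (f);
      /* Unaligned store of both doubles.  */
      _mm_storeu_pd (b + i, d);
    }
}

/* Hypothetical hand-vectorized counterpart of test_int:
   converts two ints per iteration with cvtdq2pd (assumed substitute
   for cvtpi2pd, see the note above).  */
void test_int_sse2 (const int *a, double *b)
{
  int i;

  for (i = 0; i < 4; i += 2)
    {
      /* Load two 32-bit ints into the low half of an XMM register.  */
      __m128i v = _mm_loadl_epi64 ((const __m128i *) (a + i));
      /* cvtdq2pd: convert the two low ints to two doubles.  */
      __m128d d = _mm_cvtepi32_pd (v);
      _mm_storeu_pd (b + i, d);
    }
}
```

Each loop handles two elements per iteration, so the four-element testcase
needs two iterations instead of four.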
--
Summary: Conversions are not vectorized
Product: gcc
Version: 4.1.0
Status: UNCONFIRMED
Severity: enhancement
Priority: P3
Component: tree-optimization
AssignedTo: unassigned at gcc dot gnu dot org
ReportedBy: uros at kss-loka dot si
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=24659