On Mar 22, 2007, at 11:43 AM, Bill Wendling wrote:
> +We should compile this:
> +
> +#include <xmmintrin.h>
> +
> +void foo(__m128i *A, __m128i *B) {
> +  *A = _mm_sll_epi16 (*A, *B);
> +}
> +
> +to:
> +
> +_foo:
> +        subl    $12, %esp
> +        movl    16(%esp), %edx
> +        movl    20(%esp), %eax
> +        movdqa  (%edx), %xmm1
> +        movdqa  (%eax), %xmm0
> +        psllw   %xmm0, %xmm1
> +        movdqa  %xmm1, (%edx)
> +        addl    $12, %esp
> +        ret
> +
> +not:
> +
> +_foo:
> +        movl    8(%esp), %eax
> +        movdqa  (%eax), %xmm0
> +        #IMPLICIT_DEF %eax
> +        pinsrw  $2, %eax, %xmm0
> +        xorl    %ecx, %ecx
> +        pinsrw  $3, %ecx, %xmm0
> +        pinsrw  $4, %eax, %xmm0
> +        pinsrw  $5, %ecx, %xmm0
> +        pinsrw  $6, %eax, %xmm0
> +        pinsrw  $7, %ecx, %xmm0
> +        movl    4(%esp), %eax
> +        movdqa  (%eax), %xmm1
> +        psllw   %xmm0, %xmm1
> +        movdqa  %xmm1, (%eax)
> +        ret

This looks like a *serious* SSE performance bug, not a missing feature.
Bill, can you look into this? Evan, can you help him if needed?

Thanks,

-Chris
_______________________________________________
llvm-commits mailing list
llvm-commits@cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
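
For readers following the thread, here is a rough scalar sketch of what the register form of psllw computes, which may help show why the shorter sequence quoted above is sufficient. The function name psllw_model and its parameters are hypothetical, introduced only for illustration; they are not part of LLVM or of any intrinsics header. The sketch assumes the documented x86 behavior: one shift count, read from the low 64 bits of the count operand, is applied to all eight 16-bit lanes, and a count above 15 clears every lane.

```c
#include <stdint.h>

/* Hypothetical scalar model of "psllw %xmmCount, %xmmDst" (illustration only).
 * lanes:       the eight 16-bit lanes of the destination register.
 * count_low64: the low 64 bits of the count register, read as one integer.
 */
void psllw_model(uint16_t lanes[8], uint64_t count_low64) {
    for (int i = 0; i < 8; ++i) {
        if (count_low64 > 15) {
            /* A count above 15 shifts every bit out of a 16-bit lane. */
            lanes[i] = 0;
        } else {
            /* Same count for every lane; bits shifted past bit 15 are lost. */
            lanes[i] = (uint16_t)(lanes[i] << count_low64);
        }
    }
}
```

Under this model the count vector can be handed to psllw exactly as it was loaded from *B, which is what the desired sequence does with a single movdqa.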