On Mar 22, 2007, at 11:43 AM, Bill Wendling wrote:

> +We should compile this:
> +
> +#include <xmmintrin.h>
> +
> +void foo(__m128i *A, __m128i *B) {
> +  *A = _mm_sll_epi16 (*A, *B);
> +}
> +
> +to:
> +
> +_foo:
> +     subl    $12, %esp
> +     movl    16(%esp), %edx
> +     movl    20(%esp), %eax
> +     movdqa  (%edx), %xmm1
> +     movdqa  (%eax), %xmm0
> +     psllw   %xmm0, %xmm1
> +     movdqa  %xmm1, (%edx)
> +     addl    $12, %esp
> +     ret
> +
> +not:
> +
> +_foo:
> +     movl 8(%esp), %eax
> +     movdqa (%eax), %xmm0
> +     #IMPLICIT_DEF %eax
> +     pinsrw $2, %eax, %xmm0
> +     xorl %ecx, %ecx
> +     pinsrw $3, %ecx, %xmm0
> +     pinsrw $4, %eax, %xmm0
> +     pinsrw $5, %ecx, %xmm0
> +     pinsrw $6, %eax, %xmm0
> +     pinsrw $7, %ecx, %xmm0
> +     movl 4(%esp), %eax
> +     movdqa (%eax), %xmm1
> +     psllw %xmm0, %xmm1
> +     movdqa %xmm1, (%eax)
> +     ret

This looks like a *serious* SSE performance bug, not a missing
feature.  Bill, can you look into this?  Evan, can you help him if
needed?
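
For reference, a rough scalar model of what psllw does with a register
count, as a sketch only. psllw takes the shift amount from the low 64
bits of the count operand and applies it to all eight 16-bit lanes, and a
count above 15 zeroes every lane. That is why the expected code can pass
the vector loaded from *B straight to psllw. The helper name psllw_model
is made up for illustration and is not LLVM code.

```c
#include <stdint.h>

/* Hypothetical scalar model of PSLLW with a register count (the
   instruction _mm_sll_epi16 maps to).  `a` holds the eight 16-bit
   lanes of the destination; `count_lo64` is the low 64 bits of the
   count operand.  The upper 64 bits of the count are ignored. */
static void psllw_model(uint16_t a[8], uint64_t count_lo64) {
  for (int i = 0; i < 8; ++i) {
    if (count_lo64 > 15)
      a[i] = 0;  /* shifting by 16 or more clears the lane */
    else
      a[i] = (uint16_t)((uint32_t)a[i] << count_lo64);
  }
}
```

Calling psllw_model on the lanes of *A with the low quadword of *B should
give the same lanes that foo() stores back through A.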

Thanks,

-Chris

_______________________________________________
llvm-commits mailing list
llvm-commits@cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
