https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99754

            Bug ID: 99754
           Summary: [sse2] new _mm_loadu_si16 and _mm_loadu_si32
                    implemented incorrectly
           Product: gcc
           Version: 11.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: e...@coeus-group.com
  Target Milestone: ---

Created attachment 50470
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=50470&action=edit
Trivial patch

_mm_loadu_si16 and _mm_loadu_si32 were added in GCC 11, but they are
implemented incorrectly. The value pointed to by the argument is supposed to go
in the first (lowest) element, but _mm_set_epi16 / _mm_set_epi32 take their
arguments from the highest element down to the lowest, so in GCC the value ends
up in the *last* element.

The most straightforward solution would be to change the _mm_set_* calls so the
input is used for the last argument instead of the first (patch attached).
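
To illustrate the argument-order issue, here is a minimal sketch for the 16-bit
case. The helper names (load_u16_low, load_u16_high, lane16) are hypothetical
and exist only for this illustration; this is neither the code in GCC's header
nor the attached patch. It assumes an x86 target with SSE2 enabled. The 32-bit
case would follow the same pattern with _mm_set_epi32.

```c
#include <emmintrin.h>  /* SSE2 intrinsics */
#include <stdint.h>
#include <string.h>

/* Hypothetical helpers for illustration only, not the GCC header code. */

/* _mm_set_epi16 takes its arguments as (e7, e6, ..., e0), so the value
   passed as the LAST argument lands in element 0 (the lowest 16 bits). */
static __m128i load_u16_low(const void *p)
{
    uint16_t v;
    memcpy(&v, p, sizeof v);  /* safe for unaligned pointers */
    return _mm_set_epi16(0, 0, 0, 0, 0, 0, 0, (short)v);
}

/* Passing the value as the FIRST argument puts it in element 7,
   which is the behavior this report describes as incorrect. */
static __m128i load_u16_high(const void *p)
{
    uint16_t v;
    memcpy(&v, p, sizeof v);
    return _mm_set_epi16((short)v, 0, 0, 0, 0, 0, 0, 0);
}

/* Read 16-bit element i (0 = lowest) by storing the vector to memory. */
static uint16_t lane16(__m128i x, int i)
{
    uint16_t out[8];
    _mm_storeu_si128((__m128i *)out, x);
    return out[i];
}
```

With a 16-bit input of 0x1234, lane16(load_u16_low(&v), 0) yields 0x1234, while
load_u16_high leaves element 0 zero and places the value in element 7.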

FWIW, here is LLVM's implementation:
<https://github.com/llvm/llvm-project/blob/a76d0207d5f94af698525d7dc1f0953ed35901a6/clang/lib/Headers/emmintrin.h#L1670-L1710>.
I've verified that LLVM's implementation matches ICC's.
