http://gcc.gnu.org/bugzilla/show_bug.cgi?id=48605

           Summary: gcc.target/i386/sse4_1-insertps-2.c FAILs with
                    -mtune=geode - instruction insertps with memory
                    operands behaves differently
           Product: gcc
           Version: 4.7.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: target
        AssignedTo: unassig...@gcc.gnu.org
        ReportedBy: zso...@seznam.cz
              Host: x86_64-pc-linux-gnu
            Target: i686-pc-linux-gnu


Created attachment 23978
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=23978
reduced testcase

Output:
$ gcc -m32 -msse4.1 -O testcase.c -mtune=geode
$ ./a.out 
Aborted

When comparing the asm output with and without -mtune=geode, it looks very similar.
But "insertps" seems to behave differently when the source operand is a
memory location. In that case, the "COUNT_S" field of the immediate is
ignored, and the element offset has to be encoded in the address of the
memory operand instead.

Specifically:
insertps    xmm1, XMMWORD PTR [esp+64], 78    # tmp117, val.x,
behaves the same as:
insertps    xmm1, XMMWORD PTR [esp+64], 14    # tmp114, val.x,
whereas the following should be used instead:
insertps    xmm1, XMMWORD PTR [esp+68], 14

This is what Intel's docs say as well:
IF (SRC = REG) THEN COUNT_S ← imm8[7:6]
ELSE COUNT_S ← 0
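
The rule quoted above can be illustrated with a small software model. This is
only a sketch, not code from GCC or from the attached testcase: the function
name insertps_model and the src_is_reg flag are hypothetical, and the field
layout assumed is COUNT_S = imm8[7:6], COUNT_D = imm8[5:4], ZMASK = imm8[3:0].

```c
#include <stdint.h>

/* Hypothetical model of INSERTPS, for illustration only.
 * dst        : the four floats of the destination XMM register
 * src        : register form: the four floats of the source register;
 *              memory form: pointer to the single float at the operand address
 * imm8       : the immediate byte
 * src_is_reg : nonzero for the register form, zero for the memory form
 */
static void insertps_model(float dst[4], const float *src,
                           uint8_t imm8, int src_is_reg)
{
    /* COUNT_S is only honoured for a register source; for memory it is 0. */
    unsigned count_s = src_is_reg ? ((imm8 >> 6) & 3u) : 0u;
    unsigned count_d = (imm8 >> 4) & 3u;   /* destination element */
    unsigned zmask   = imm8 & 0xFu;        /* elements to zero afterwards */
    unsigned i;

    float selected = src[count_s];         /* read first, in case dst == src */
    dst[count_d] = selected;

    for (i = 0; i < 4; i++)
        if (zmask & (1u << i))
            dst[i] = 0.0f;
}
```

In this model, immediates 78 (0x4E, COUNT_S = 1) and 14 (0x0E, COUNT_S = 0)
give the same result for a memory source, because COUNT_S is forced to 0. To
pick element 1 from memory, the pointer itself has to be advanced by one float
(4 bytes), which corresponds to using [esp+68] instead of [esp+64].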
