On Thu, Apr 14, 2011 at 9:11 PM, Jakub Jelinek <ja...@redhat.com> wrote:
> The insertps and vinsertps insns work differently if %2 is a register
> resp. memory. The register is 128-bit XMM register and the upper 2 bits
> of the 8 bit immediate then select which of the 4 parts of that register to
> choose. If it is a memory though, the upper 2 bits of the 8 bit immediate
> are ignored and the memory is 32-bit rather than 128-bit.
> The following patch fixes two problems - with -masm=intel any
> sse4_1_insertps emitted insn when operands[2] is a MEM wouldn't assemble,
> as it was using XMMWORD instead of DWORD. And the second problem is
> that if the top 2 bits are non-zero, the address needs to be adjusted
> (and there is no reason not to clear the upper two bits of the immediate
> at the same time).
>
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk/4.6?
>
> 2011-04-14 Jakub Jelinek <ja...@redhat.com>
>
>     PR target/48605
>     * config/i386/sse.md (sse4_1_insertps): If operands[2] is a MEM,
>     offset it as needed based on top 2 bits in operands[3], change
>     MEM mode to SFmode and mask those 2 bits away from operands[3].
>
>     * gcc.target/i386/sse4_1-insertps-3.c: New test.
>     * gcc.target/i386/sse4_1-insertps-4.c: New test.
>     * gcc.target/i386/avx-insertps-3.c: New test.
>     * gcc.target/i386/avx-insertps-4.c: New test.

OK.

Thanks,
Uros.
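
For readers following the thread, the operand adjustment described in the ChangeLog entry can be sketched in plain C. This is only an illustration of the arithmetic, not the code in sse.md: the helper names insertps_mem_offset and insertps_mem_imm are hypothetical, and the sketch assumes the four parts of the 128-bit operand are consecutive 32-bit (4-byte) elements, so the top 2 bits of the immediate scale to a byte offset of 0, 4, 8 or 12.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helpers, named for illustration only; they are not
   part of GCC.  They model the adjustment applied when operands[2]
   of insertps/vinsertps is a MEM.  */

/* Bits 7:6 of the immediate pick one of the four 32-bit parts of a
   128-bit source.  For a memory source the hardware ignores those
   bits, so the selection has to be folded into the address instead:
   part N lives N * 4 bytes past the start of the 128-bit object.  */
static size_t
insertps_mem_offset (unsigned imm8)
{
  assert (imm8 <= 0xff);          /* the immediate is 8 bits wide */
  return (size_t) ((imm8 >> 6) & 3) * 4;
}

/* Once the address carries the selection, the top 2 bits can be
   cleared; the low 6 bits (destination part and zero mask) stay.  */
static unsigned
insertps_mem_imm (unsigned imm8)
{
  assert (imm8 <= 0xff);
  return imm8 & 0x3f;
}
```

For example, an immediate of 0x50 with a memory source would become a DWORD access at offset 4 with immediate 0x10, while an immediate whose top 2 bits are already zero is left unchanged with offset 0.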