On Tue, Oct 25, 2011 at 10:21 AM, Jakub Jelinek <ja...@redhat.com> wrote:
> For -masm=intel we currently emit invalid assembler for the v*gather* insns,
>         vpgatherdd ymm0, (rax, ymm1, 1), ymm2
> is some weird mixture of AT&T and Intel syntax. Furthermore,
> by requiring a register as a base we unnecessarily penalize the code,
> even when the base is constant or symbol, we force it into register, and
> when it is a register + displacement, it is again forced into a register.
> The following patch changes it so that we have a MEM with scalarssemode
> (for -masm=intel to print the right DWORD PTR or QWORD PTR) with
> UNSPEC_VSIBADDR as address. This UNSPEC contains the needed triplet
> (base register/displacement/register+displacement, xmmN/ymmN register
> and scale) and ix86_print_operand_address is taught to print it.
> With this we emit e.g.
>         vpgatherdd ymm0, DWORD PTR base[16+ymm1*1], ymm2
> or
>         vpgatherdd %ymm2, base+16(,%ymm1,1), %ymm0
> Apparently VSIB addressing doesn't allow (%rip), so I had to reject certain
> UNSPECs - those need to go into a lea.
> Testcases have been adjusted so that they work with -masm=intel too and
> additionally don't match just one arbitrary [xy]mm register type out of the
> 3, but all 3 and thus checks whether the right combination of
> %xmmN.*%xmmO.*%xmmP, %ymmN.*%ymmO.*%ymmP, %xmmN.*%ymmO.*%xmmP
> resp. %ymmN.*%xmmO.*%ymmP is used.
>
> I've noticed at least my version of binutils (a couple of weeks old) doesn't
> want to grok DWORD PTR base[16+ymm1] for scale 1, needs ymm1*1, so I'm
> forcing the output of *1 for the VSIB addressing mode (for AT&T as groks it
> even without it, I've forced the ,1 there anyway, but can remove it if
> requested).
>
> Regtested with {-m32,-m64} {,-fpic} {,-masm=intel} on i386.exp=*gather*.c
> and additionally tried to assemble all of them in all those modes.
>
> Ok for trunk?
>
> 2011-10-25 Jakub Jelinek <ja...@redhat.com>
>
>         * config/i386/i386.md (UNSPEC_VSIBADDR): New.
>         * config/i386/predicates.md (vsib_address_operand,
>         vsib_mem_operator): New predicates.
>         * config/i386/i386.c (ix86_print_operand_address): Handle
>         UNSPEC_VSIBADDR addresses.
>         * config/i386/sse.md (avx2_gathersi<mode>, avx2_gatherdi<mode>,
>         avx2_gatherdi<mode>256): Adjust expanders to use MEM with
>         UNSPEC_VSIBADDR address.
>         (*avx2_gathersi<mode>, *avx2_gatherdi<mode>, *avx2_gatherdi<mode>256):
>         Adjust insns to use MEM with UNSPEC_VSIBADDR address.
>
>         * gcc.target/i386/avx2-i32gatherd-1.c: Adjust scan-assembler regex
>         to work also with -masm=intel and additionally test the xmm vs. ymm
>         register type combination on mask/dest and in vsib.
>         * gcc.target/i386/avx2-i32gatherd256-1.c: Likewise.
>         * gcc.target/i386/avx2-i32gatherd256-3.c: Likewise.
>         * gcc.target/i386/avx2-i32gatherd-3.c: Likewise.
>         * gcc.target/i386/avx2-i32gatherpd-1.c: Likewise.
>         * gcc.target/i386/avx2-i32gatherpd256-1.c: Likewise.
>         * gcc.target/i386/avx2-i32gatherpd256-3.c: Likewise.
>         * gcc.target/i386/avx2-i32gatherpd-3.c: Likewise.
>         * gcc.target/i386/avx2-i32gatherps-1.c: Likewise.
>         * gcc.target/i386/avx2-i32gatherps256-1.c: Likewise.
>         * gcc.target/i386/avx2-i32gatherps256-3.c: Likewise.
>         * gcc.target/i386/avx2-i32gatherps-3.c: Likewise.
>         * gcc.target/i386/avx2-i32gatherq-1.c: Likewise.
>         * gcc.target/i386/avx2-i32gatherq256-1.c: Likewise.
>         * gcc.target/i386/avx2-i32gatherq256-3.c: Likewise.
>         * gcc.target/i386/avx2-i32gatherq-3.c: Likewise.
>         * gcc.target/i386/avx2-i64gatherd-1.c: Likewise.
>         * gcc.target/i386/avx2-i64gatherd256-1.c: Likewise.
>         * gcc.target/i386/avx2-i64gatherd256-3.c: Likewise.
>         * gcc.target/i386/avx2-i64gatherd-3.c: Likewise.
>         * gcc.target/i386/avx2-i64gatherpd-1.c: Likewise.
>         * gcc.target/i386/avx2-i64gatherpd256-1.c: Likewise.
>         * gcc.target/i386/avx2-i64gatherpd256-3.c: Likewise.
>         * gcc.target/i386/avx2-i64gatherpd-3.c: Likewise.
>         * gcc.target/i386/avx2-i64gatherps-1.c: Likewise.
>         * gcc.target/i386/avx2-i64gatherps256-1.c: Likewise.
>         * gcc.target/i386/avx2-i64gatherps256-3.c: Likewise.
>         * gcc.target/i386/avx2-i64gatherps-3.c: Likewise.
>         * gcc.target/i386/avx2-i64gatherq-1.c: Likewise.
>         * gcc.target/i386/avx2-i64gatherq256-1.c: Likewise.
>         * gcc.target/i386/avx2-i64gatherq256-3.c: Likewise.
>         * gcc.target/i386/avx2-i64gatherq-3.c: Likewise.

OK.

Thanks,
Uros.
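
For readers following the thread, here is a rough sketch of how a VSIB triplet (base/displacement, vector index register, scale) maps to the two assembler syntaxes quoted above. It is not the code from the patch: `VsibAddress`, `format_att` and `format_intel` are hypothetical names made up for illustration, and the operand ordering inside the Intel brackets is only an assumption modelled on the example in the message.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class VsibAddress:
    """Hypothetical model of the triplet described above (not a GCC type)."""
    index: str                      # vector index register, e.g. "ymm1"
    scale: int = 1                  # must be 1, 2, 4 or 8
    symbol: Optional[str] = None    # symbolic part of the displacement
    disp: int = 0                   # constant part of the displacement
    base_reg: Optional[str] = None  # general-purpose base register

    def __post_init__(self) -> None:
        if self.scale not in (1, 2, 4, 8):
            raise ValueError(f"invalid scale: {self.scale}")
        # The message notes that VSIB addressing does not allow (%rip).
        if self.base_reg in ("rip", "eip"):
            raise ValueError("RIP-relative base is not usable with VSIB")


def format_att(a: VsibAddress) -> str:
    """AT&T form: disp(base,index,scale), with the scale always printed."""
    disp = a.symbol or ""
    if a.disp:
        # Signed form after a symbol ("base+16"), plain number otherwise.
        disp += f"{a.disp:+d}" if a.symbol else str(a.disp)
    base = f"%{a.base_reg}" if a.base_reg else ""
    return f"{disp}({base},%{a.index},{a.scale})"


def format_intel(a: VsibAddress, size: str = "DWORD") -> str:
    """Intel form: SIZE PTR symbol[base+disp+index*scale], '*1' included."""
    parts = []
    if a.base_reg:
        parts.append(a.base_reg)
    if a.disp:
        parts.append(str(a.disp))
    parts.append(f"{a.index}*{a.scale}")
    inner = parts[0]
    for p in parts[1:]:
        # A negative displacement already carries its own sign.
        inner += p if p.startswith("-") else "+" + p
    return f"{size} PTR {a.symbol or ''}[{inner}]"


addr = VsibAddress(index="ymm1", scale=1, symbol="base", disp=16)
print(format_att(addr))
print(format_intel(addr))
```

With the `addr` value above, the two calls reproduce the memory operands from the message, `base+16(,%ymm1,1)` and `DWORD PTR base[16+ymm1*1]`. The scale is printed even when it is 1, mirroring the binutils workaround described in the quoted text.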