On Fri, Mar 12, 2021 at 06:48:57PM +0100, Uros Bizjak wrote: > It is hidden in *vec_extractv4si pattern: > > (define_insn "*vec_extractv4si" > - [(set (match_operand:SI 0 "nonimmediate_operand" "=rm,rm,Yr,*x,x,Yv") > + [(set (match_operand:SI 0 "nonimmediate_operand" "=rm,rm,Yr,*x,Yw") > (vec_select:SI > - (match_operand:V4SI 1 "register_operand" "x,v,0,0,x,v") > + (match_operand:V4SI 1 "register_operand" " x, v, 0, 0,Yw") > (parallel [(match_operand:SI 2 "const_0_to_3_operand")])))] > "TARGET_SSE4_1" > { > @@ -15668,7 +15660,6 @@ > return "psrldq\t{%2, %0|%0, %2}"; > > case 4: > - case 5: > operands[2] = GEN_INT (INTVAL (operands[2]) * 4); > return "vpsrldq\t{%2, %1, %0|%0, %1, %2}";
Ah, ok, thanks. > > and I believe I had exactly this change in an earlier version of my patch > > and it didn't work (broke > > +FAIL: gcc.target/i386/avx512vl-vpalignr-4.c scan-assembler-not > > vpalignr[^\\n\\r]*\\\\\$8[^\\n\\r]*%xmm16[^\\n\\r]*%xmm16[^\\n\\r]*%xmm16 > > ), which is why I've reverted it. > > It could use YW instead of Yw though and then it should work. > > My copy of x86 ISA says that vpalignr with xmm operands needs AVX512VL > and AVX512BW. Is the testcase correct? > > [uros@localhost test]$ cat palignr.s > vpalignr $2, %xmm22, %xmm23, %xmm24 > [uros@localhost test]$ as -march=+noavx512vl palignr.s > palignr.s: Assembler messages: > palignr.s:1: Error: unsupported instruction `vpalignr' > [uros@localhost test]$ as -march=+noavx512bw palignr.s > palignr.s: Assembler messages: > palignr.s:1: Error: unsupported instruction `vpalignr' That is indeed the case, but often AVX512VL is implied already from the operand mode. The pattern uses V_128 iterator, so: [V16QI V8HI V4SI V2DI V4SF (V2DF "TARGET_SSE2")]) For all these modes, ix86_hard_regno_mode_ok will return false for %xmm16+ when -mavx512vl is not true. Anyway, sorry, I was wrong, I actually had multiple versions of the patch and this pattern I've changed more than once, first used the Yw in there and then after seeing bugs in other patterns changed that to <v_Yw> and that was the patch that actually failed the gcc.target/i386/avx512vl-vpalignr-4.c test and I've just removed those hunks and rebootstrapped/retested afterwards. <v_Yw> instead of Yw was needed for numerous patterns, usually when they include both 16-byte, 32-byte and 64-byte vectors in the same define_insn*, so that the 64-byte which typically had TARGET_AVX512BW in the iterator's condition uses v and doesn't need AVX512VL. But is not a good fit for *ssse3_palignr<mode>_perm, because that pattern uses various non-QI/HI modes for which it still wants Yw/YW. So, your patch looks good to me. I can test it on avx512{bw,vl,dq} hw tonight if you want. Jakub