V pet., 12. mar. 2021 19:19 je oseba Jakub Jelinek <ja...@redhat.com> napisala: > > On Fri, Mar 12, 2021 at 06:48:57PM +0100, Uros Bizjak wrote: > > It is hidden in *vec_extractv4si pattern: > > > > (define_insn "*vec_extractv4si" > > - [(set (match_operand:SI 0 "nonimmediate_operand" "=rm,rm,Yr,*x,x,Yv") > > + [(set (match_operand:SI 0 "nonimmediate_operand" "=rm,rm,Yr,*x,Yw") > > (vec_select:SI > > - (match_operand:V4SI 1 "register_operand" "x,v,0,0,x,v") > > + (match_operand:V4SI 1 "register_operand" " x, v, 0, 0,Yw") > > (parallel [(match_operand:SI 2 "const_0_to_3_operand")])))] > > "TARGET_SSE4_1" > > { > > @@ -15668,7 +15660,6 @@ > > return "psrldq\t{%2, %0|%0, %2}"; > > > > case 4: > > - case 5: > > operands[2] = GEN_INT (INTVAL (operands[2]) * 4); > > return "vpsrldq\t{%2, %1, %0|%0, %1, %2}"; > > Ah, ok, thanks. > > > > and I believe I had exactly this change in an earlier version of my patch > > > and it didn't work (broke > > > +FAIL: gcc.target/i386/avx512vl-vpalignr-4.c scan-assembler-not > > > vpalignr[^\\n\\r]*\\\\\$8[^\\n\\r]*%xmm16[^\\n\\r]*%xmm16[^\\n\\r]*%xmm16 > > > ), which is why I've reverted it. > > > It could use YW instead of Yw though and then it should work. > > > > My copy of x86 ISA says that vpalignr with xmm operands needs AVX512VL > > and AVX512BW. Is the testcase correct? > > > > [uros@localhost test]$ cat palignr.s > > vpalignr $2, %xmm22, %xmm23, %xmm24 > > [uros@localhost test]$ as -march=+noavx512vl palignr.s > > palignr.s: Assembler messages: > > palignr.s:1: Error: unsupported instruction `vpalignr' > > [uros@localhost test]$ as -march=+noavx512bw palignr.s > > palignr.s: Assembler messages: > > palignr.s:1: Error: unsupported instruction `vpalignr' > > That is indeed the case, but often AVX512VL is implied > already from the operand mode. > The pattern uses V_128 iterator, so: > [V16QI V8HI V4SI V2DI V4SF (V2DF "TARGET_SSE2")]) > For all these modes, ix86_hard_regno_mode_ok > will return false for %xmm16+ when -mavx512vl is not true. > > Anyway, sorry, I was wrong, I actually had multiple versions > of the patch and this pattern I've changed more than once, > first used the Yw in there and then after seeing bugs in other > patterns changed that to <v_Yw> and that was the patch that > actually failed the gcc.target/i386/avx512vl-vpalignr-4.c test > and I've just removed those hunks and rebootstrapped/retested > afterwards. > <v_Yw> instead of Yw was needed for numerous patterns, usually > when they include both 16-byte, 32-byte and 64-byte vectors > in the same define_insn*, so that the 64-byte which typically > had TARGET_AVX512BW in the iterator's condition uses v and doesn't need > AVX512VL. > But is not a good fit for *ssse3_palignr<mode>_perm, > because that pattern uses various non-QI/HI modes for which > it still wants Yw/YW. > > So, your patch looks good to me. > > I can test it on avx512{bw,vl,dq} hw tonight if you want.
I'm testing the patch on avx2 hw, which is not representative of this change. So if you can spare a few cycles, that would be awesome. Thanks, Uros.