On Fri, Mar 12, 2021 at 06:48:57PM +0100, Uros Bizjak wrote:
> It is hidden in *vec_extractv4si pattern:
> 
>  (define_insn "*vec_extractv4si"
> -  [(set (match_operand:SI 0 "nonimmediate_operand" "=rm,rm,Yr,*x,x,Yv")
> +  [(set (match_operand:SI 0 "nonimmediate_operand" "=rm,rm,Yr,*x,Yw")
>      (vec_select:SI
> -      (match_operand:V4SI 1 "register_operand" "x,v,0,0,x,v")
> +      (match_operand:V4SI 1 "register_operand" "  x, v, 0, 0,Yw")
>        (parallel [(match_operand:SI 2 "const_0_to_3_operand")])))]
>    "TARGET_SSE4_1"
>  {
> @@ -15668,7 +15660,6 @@
>        return "psrldq\t{%2, %0|%0, %2}";
> 
>      case 4:
> -    case 5:
>        operands[2] = GEN_INT (INTVAL (operands[2]) * 4);
>        return "vpsrldq\t{%2, %1, %0|%0, %1, %2}";

Ah, ok, thanks.

> > and I believe I had exactly this change in an earlier version of my patch
> > and it didn't work (broke
> > +FAIL: gcc.target/i386/avx512vl-vpalignr-4.c scan-assembler-not 
> > vpalignr[^\\n\\r]*\\\\\$8[^\\n\\r]*%xmm16[^\\n\\r]*%xmm16[^\\n\\r]*%xmm16
> > ), which is why I've reverted it.
> > It could use YW instead of Yw though and then it should work.
> 
> My copy of x86 ISA says that vpalignr with xmm operands needs AVX512VL
> and AVX512BW. Is the testcase correct?
> 
> [uros@localhost test]$ cat palignr.s
>        vpalignr $2, %xmm22, %xmm23, %xmm24
> [uros@localhost test]$ as -march=+noavx512vl palignr.s
> palignr.s: Assembler messages:
> palignr.s:1: Error: unsupported instruction `vpalignr'
> [uros@localhost test]$ as -march=+noavx512bw palignr.s
> palignr.s: Assembler messages:
> palignr.s:1: Error: unsupported instruction `vpalignr'

That is indeed the case, but often AVX512VL is implied
already from the operand mode.
The pattern uses V_128 iterator, so:
  [V16QI V8HI V4SI V2DI V4SF (V2DF "TARGET_SSE2")])
For all these modes, ix86_hard_regno_mode_ok
will return false for %xmm16+ when -mavx512vl is not true.

Anyway, sorry, I was wrong, I actually had multiple versions
of the patch and this pattern I've changed more than once,
first used the Yw in there and then after seeing bugs in other
patterns changed that to <v_Yw> and that was the patch that
actually failed the gcc.target/i386/avx512vl-vpalignr-4.c test
and I've just removed those hunks and rebootstrapped/retested
afterwards.
<v_Yw> instead of Yw was needed for numerous patterns, usually
when they include both 16-byte, 32-byte and 64-byte vectors
in the same define_insn*, so that the 64-byte which typically
had TARGET_AVX512BW in the iterator's condition uses v and doesn't need
AVX512VL.
But is not a good fit for *ssse3_palignr<mode>_perm,
because that pattern uses various non-QI/HI modes for which
it still wants Yw/YW.

So, your patch looks good to me.

I can test it on avx512{bw,vl,dq} hw tonight if you want.

        Jakub

Reply via email to