On Tue, May 5, 2020 at 9:11 AM Jakub Jelinek <ja...@redhat.com> wrote:
>
> Hi!
>
> This insn and split splits into HI->V?HImode broadcast for avx2 and later,
> but either the operands need to be %xmm0-%xmm15 (i.e. VEX encoded insn), or
> the insn needs both AVX512BW and AVX512VL.
> Now, Yv constraint is v for AVX512VL and x otherwise, so for -mavx512vl
> -mno-avx512bw we ICE if we end up with a %xmm16+ register from RA.
> Yw constraint is v for AVX512VL and AVX512BW and nothing otherwise, so
> in this pattern we actually need xYw.
>
> Fixed thusly, bootstrapped/regtested on x86_64-linux and i686-linux, ok for
> trunk?
>
> 2020-05-05  Jakub Jelinek  <ja...@redhat.com>
>
>	PR target/94942
>	* config/i386/mmx.md (*vec_dupv4hi): Use xYw constraints instead of
>	Yv.
>
>	* gcc.target/i386/pr94942.c: New test.

OK.

Thanks,
Uros.

> --- gcc/config/i386/mmx.md.jj	2020-03-12 14:28:13.343364091 +0100
> +++ gcc/config/i386/mmx.md	2020-05-04 13:48:14.946723617 +0200
> @@ -1613,10 +1613,10 @@ (define_insn "mmx_pswapdv2si2"
>     (set_attr "mode" "DI")])
>
>  (define_insn_and_split "*vec_dupv4hi"
> -  [(set (match_operand:V4HI 0 "register_operand" "=y,Yv,Yw")
> +  [(set (match_operand:V4HI 0 "register_operand" "=y,xYw,Yw")
>  	(vec_duplicate:V4HI
>  	  (truncate:HI
> -	    (match_operand:SI 1 "register_operand" "0,Yv,r"))))]
> +	    (match_operand:SI 1 "register_operand" "0,xYw,r"))))]
>    "(TARGET_MMX || TARGET_MMX_WITH_SSE)
>     && (TARGET_SSE || TARGET_3DNOW_A)"
>    "@
> --- gcc/testsuite/gcc.target/i386/pr94942.c.jj	2020-05-04 13:51:56.512495800 +0200
> +++ gcc/testsuite/gcc.target/i386/pr94942.c	2020-05-04 13:52:43.926805052 +0200
> @@ -0,0 +1,24 @@
> +/* PR target/94942 */
> +/* { dg-do compile } */
> +/* { dg-options "-O -flive-range-shrinkage -ftree-vrp -mavx512vl -mno-avx512bw -Wno-div-by-zero" } */
> +
> +typedef unsigned __attribute__((__vector_size__(8))) U;
> +typedef short __attribute__((__vector_size__(8))) V;
> +typedef char __attribute__((__vector_size__(16))) W;
> +typedef int __attribute__((__vector_size__(16))) Z;
> +int i, j, n, o;
> +W k;
> +Z l;
> +char m;
> +
> +U
> +foo (U q, long long r, V s)
> +{
> +  Z t = (i & i - (Z){10} & 4) - (0 != j);
> +  Z u = o * (j * l);
> +  s -= (char)__builtin_clrsbll (n);
> +  W v = (k | k >> m + (W){4}) % 0;
> +  W w = v + (W)t + (W)u;
> +  U x = ((union { W a; U b; })w).b + q + (U)s + (U)r;
> +  return x;
> +}
>
>	Jakub
>
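
For readers following the constraint reasoning in the quoted message, here is a rough C model of which %xmm registers each constraint admits. It is only a sketch of the description above, not code from GCC: the struct and every function name are hypothetical, and it assumes 64-bit mode, where "v" covers %xmm0-%xmm31 and "x" covers %xmm0-%xmm15.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model, NOT GCC source: the two ISA flags that matter here. */
struct isa { bool avx512vl; bool avx512bw; };

/* "x": only %xmm0-%xmm15, i.e. registers a VEX-encoded insn can name. */
static bool x_allows (int xmm) { return xmm >= 0 && xmm <= 15; }

/* "v": any SSE register; assumed here to be %xmm0-%xmm31 (64-bit mode). */
static bool v_allows (int xmm) { return xmm >= 0 && xmm <= 31; }

/* "Yv": v for AVX512VL and x otherwise. */
static bool yv_allows (struct isa f, int xmm)
{
  return f.avx512vl ? v_allows (xmm) : x_allows (xmm);
}

/* "Yw": v for AVX512VL and AVX512BW, nothing otherwise. */
static bool yw_allows (struct isa f, int xmm)
{
  return (f.avx512vl && f.avx512bw) ? v_allows (xmm) : false;
}

/* "xYw": the union of the two constraints. */
static bool xyw_allows (struct isa f, int xmm)
{
  return x_allows (xmm) || yw_allows (f, xmm);
}

/* The HI broadcast is encodable with %xmm0-%xmm15 operands (VEX), or with
   any register when both AVX512BW and AVX512VL are enabled.  */
static bool broadcast_encodable (struct isa f, int xmm)
{
  return x_allows (xmm) || (f.avx512vl && f.avx512bw && v_allows (xmm));
}

/* The -mavx512vl -mno-avx512bw case from the PR.  */
static void check_pr94942_case (void)
{
  struct isa vl_only = { true, false };
  assert (yv_allows (vl_only, 16));            /* old constraint admits %xmm16 */
  assert (!broadcast_encodable (vl_only, 16)); /* but the insn cannot encode it */
  assert (!xyw_allows (vl_only, 16));          /* new constraint rejects it */
  assert (xyw_allows (vl_only, 15));           /* %xmm15 is still usable */
}
```

Under this model, xYw admits a register exactly when the broadcast is encodable, while Yv is too permissive only in the AVX512VL-without-AVX512BW configuration that the new test exercises.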