On Thu, Oct 30, 2014 at 3:55 PM, Ilya Tocar <tocarip.in...@gmail.com> wrote:
> Hi,
>
> I've run gcc.dg/torture/* tests with -mavx512bw -mavx512vl -mavx512dq
> flags, and got a bunch of fails (mostly in permutes autogen).
> Patch below fixes them.
> Ok for trunk?
>
> 2014-10-30  Ilya Tocar  <ilya.to...@intel.com>
>
>       * config/i386/i386.c (expand_vec_perm_pshufb): Try vpermq/vpermd
>       for 512-bit wide modes.
>       (expand_vec_perm_1): Use correct versions of patterns.
>       * config/i386/sse.md (avx512f_vec_dup_<mode>_1): New.
>       (vashr<mode>3<mask_name>): Split V8HImode and V16QImode.

Please name new patterns ..._vec_dup<mode>..., without space between
vec_dup and <mode>.

> ---
>  gcc/config/i386/i386.c | 59 ++++++++++++++++++++++++++++++++++++++++++++------
>  gcc/config/i386/sse.md | 54 ++++++++++++++++++++++++++++++++++++++-------
>  2 files changed, 98 insertions(+), 15 deletions(-)
>
> diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
> index 71a4f6a..74ff894 100644
> --- a/gcc/config/i386/i386.c
> +++ b/gcc/config/i386/i386.c
> @@ -45889,6 +45889,42 @@ expand_vec_perm_pshufb (struct expand_vec_perm_d *d)
>         {
>           if (!TARGET_AVX512BW)
>             return false;
> +
> +         /* If vpermq didn't work, vpshufb won't work either. */
> +         if (d->vmode == V8DFmode || d->vmode == V8DImode)
> +           return false;
> +
> +         vmode = V64QImode;
> +         if (d->vmode == V16SImode
> +             || d->vmode == V32HImode
> +             || d->vmode == V64QImode)
> +           {
> +             /* First see if vpermq can be used for
> +                V16SImode/V32HImode/V64QImode. */
> +             if (valid_perm_using_mode_p (V8DImode, d))
> +               {
> +                 for (i = 0; i < 8; i++)
> +                   perm[i] = (d->perm[i * nelt / 8] * 8 / nelt) & 7;
> +                 if (d->testing_p)
> +                   return true;
> +                 target = gen_reg_rtx (V8DImode);
> +                 if (expand_vselect (target, gen_lowpart (V8DImode, d->op0),
> +                                     perm, 8, false))
> +                   {
> +                     emit_move_insn (d->target,
> +                                     gen_lowpart (d->vmode, target));
> +                     return true;
> +                   }
> +                 return false;
> +               }
> +
> +             /* Next see if vpermd can be used. */
> +             if (valid_perm_using_mode_p (V16SImode, d))
> +               vmode = V16SImode;
> +           }
> +         /* Or if vpermps can be used. */
> +         else if (d->vmode == V16SFmode)
> +           vmode = V16SImode;
>           if (vmode == V64QImode)
>             {
>               /* vpshufb only works intra lanes, it is not
> @@ -45908,6 +45944,9 @@ expand_vec_perm_pshufb (struct expand_vec_perm_d *d)
>        if (vmode == V8SImode)
>         for (i = 0; i < 8; ++i)
>           rperm[i] = GEN_INT ((d->perm[i * nelt / 8] * 8 / nelt) & 7);
> +      else if (vmode == V16SImode)
> +       for (i = 0; i < 16; ++i)
> +         rperm[i] = GEN_INT ((d->perm[i * nelt / 16] * 16 / nelt) & 15);
>        else
>         {
>           eltsz = GET_MODE_SIZE (GET_MODE_INNER (d->vmode));

I'd like to ask Jakub for a review of the above two parts, other parts
are OK with a rename (as mentioned above).

Uros.
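
For reference while reviewing the two hunks above, here is a small standalone sketch of the index arithmetic they rely on: a permutation over nelt narrow elements is re-expressed over fewer, wider elements, using the same formula as the patch, (perm[i * nelt / N] * N / nelt) & (N - 1). The function name coarsen_perm and its chunk check are hypothetical and written only for illustration; they are not GCC code and only assume that the wider permute is usable when every aligned group of narrow elements moves as a unit.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical helper, not part of GCC.  Tries to rewrite a permutation
   of NELT narrow elements as a permutation of COARSE wider elements
   (COARSE must be a power of two that divides NELT).  Returns false if
   some group of NELT / COARSE narrow elements is split or misaligned.  */
static bool
coarsen_perm (const unsigned char *perm, unsigned nelt, unsigned coarse,
              unsigned char *out)
{
  assert (coarse != 0 && nelt % coarse == 0);
  unsigned chunk = nelt / coarse;

  /* Each chunk must start on a chunk boundary in the source and
     select consecutive source elements.  */
  for (unsigned i = 0; i < nelt; i += chunk)
    {
      if (perm[i] % chunk != 0)
        return false;
      for (unsigned j = 1; j < chunk; j++)
        if (perm[i + j] != perm[i] + j)
          return false;
    }

  /* Same remapping as in the patch: take the first narrow index of
     each chunk and scale it down to a wide-element index.  */
  for (unsigned i = 0; i < coarse; i++)
    out[i] = (perm[i * nelt / coarse] * coarse / nelt) & (coarse - 1);
  return true;
}
```

With nelt = 16 and coarse = 8 (the V16SImode via V8DImode case), a permutation that swaps adjacent 64-bit pairs, {2,3,0,1,6,7,4,5,...}, coarsens to {1,0,3,2,...}, while one that swaps adjacent 32-bit elements, {1,0,3,2,...}, is rejected and would have to fall through to the vpermd path.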