On Fri, Sep 23, 2022 at 02:42:54PM +0800, liuhongt via Gcc-patches wrote:
> 2022-09-23  Hongtao Liu  <hongtao....@intel.com>
>           Liwei Xu  <liwei...@intel.com>
> 
> gcc/ChangeLog:
> 
>       PR target/53346
>       * config/i386/i386-expand.cc (expand_vec_perm_shufps_shufps):
>       New function.
>       (ix86_expand_vec_perm_const_1): Insert
>       expand_vec_perm_shufps_shufps at the end of 2-instruction
>       expand sequence.
> 
> gcc/testsuite/ChangeLog:
> 
>       * gcc.target/i386/pr53346-1.c: New test.
>       * gcc.target/i386/pr53346-2.c: New test.
> ---
>  gcc/config/i386/i386-expand.cc            | 117 ++++++++++++++++++++++
>  gcc/testsuite/gcc.target/i386/pr53346-1.c |  70 +++++++++++++
>  gcc/testsuite/gcc.target/i386/pr53346-2.c |  59 +++++++++++
>  gcc/testsuite/gcc.target/i386/pr53346-3.c |  69 +++++++++++++
>  gcc/testsuite/gcc.target/i386/pr53346-4.c |  59 +++++++++++
>  5 files changed, 374 insertions(+)
>  create mode 100644 gcc/testsuite/gcc.target/i386/pr53346-1.c
>  create mode 100644 gcc/testsuite/gcc.target/i386/pr53346-2.c
>  create mode 100644 gcc/testsuite/gcc.target/i386/pr53346-3.c
>  create mode 100644 gcc/testsuite/gcc.target/i386/pr53346-4.c
> 
> diff --git a/gcc/config/i386/i386-expand.cc b/gcc/config/i386/i386-expand.cc
> index 5334363e235..43c58111a62 100644
> --- a/gcc/config/i386/i386-expand.cc
> +++ b/gcc/config/i386/i386-expand.cc
> @@ -19604,6 +19604,120 @@ expand_vec_perm_1 (struct expand_vec_perm_d *d)
>    return false;
>  }
>  
> +/* A subroutine of ix86_expand_vec_perm_const_1. Try to implement D
> +   in terms of a pair of shufps+ shufps/pshufd instructions. */
> +static bool
> +expand_vec_perm_shufps_shufps (struct expand_vec_perm_d *d)
> +{
> +  unsigned char perm1[4];
> +  machine_mode vmode = d->vmode;
> +  bool ok;
> +  unsigned i, j, k, count = 0;
> +
> +  if (d->one_operand_p
> +      || (vmode != V4SImode && vmode != V4SFmode))
> +    return false;
> +
> +  if (d->testing_p)
> +    return true;
> +
> +  for (i = 0; i < 4; ++i)
> +    count += d->perm[i] > 3 ? 1 : 0;
> +
> +  gcc_assert(count & 3);

Missing space before (

> +      /* shufps.  */
> +      ok = expand_vselect_vconcat(tmp, d->op0, d->op1,
> +                               perm1, d->nelt, false);

Ditto.

> +      /* When lone_idx is not 0, it must from second op(count == 1).  */
> +      gcc_assert ((lone_idx == 0 && count == 3)
> +               || (lone_idx != 0 && count == 1));

Perhaps write it more simply as
      gcc_assert (count == (lone_idx ? 1 : 3));
?
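
A quick standalone check of that equivalence, as a sketch only: it is not
part of the patch, and the function names are made up for illustration.
It compares the condition as written with the shorter form over a small
range of lone_idx and count values.

```c
#include <stdbool.h>

/* Assertion condition as written in the patch.  */
static bool
cond_original (unsigned lone_idx, unsigned count)
{
  return (lone_idx == 0 && count == 3) || (lone_idx != 0 && count == 1);
}

/* The same condition in the suggested shorter form.  */
static bool
cond_simplified (unsigned lone_idx, unsigned count)
{
  return count == (lone_idx ? 1 : 3);
}

/* Return true if both forms agree for every lone_idx and count in 0..7
   (hypothetical helper, only for this check).  */
static bool
conditions_agree (void)
{
  for (unsigned lone_idx = 0; lone_idx < 8; ++lone_idx)
    for (unsigned count = 0; count < 8; ++count)
      if (cond_original (lone_idx, count)
          != cond_simplified (lone_idx, count))
        return false;
  return true;
}
```

Both forms require count == 3 when lone_idx is zero and count == 1
otherwise, so they accept exactly the same inputs.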

> +      /* shufps.  */
> +      ok = expand_vselect_vconcat(tmp, d->op0, d->op1,
> +                               perm1, d->nelt, false);

Missing space before (

> +      gcc_assert (ok);
> +
> +      /* Refine lone and pair index to original order.  */
> +      perm1[shift] = lone_idx << 1;
> +      perm1[shift + 1] = pair_idx << 1;
> +
> +      /* Select the remaining 2 elements in another vector.  */
> +      for (i = 2 - shift; i < 4 - shift; ++i)
> +     perm1[i] = (lone_idx == 1) ? (d->perm[i] + 4) : d->perm[i];

All the ()s in the above line aren't needed.
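
To illustrate (a sketch with made-up function names, not code from the
patch): == and + both bind tighter than ?:, so the expression parses the
same way with or without the parentheses.

```c
/* The expression as written in the patch, with the parentheses.  */
static unsigned
select_with_parens (unsigned lone_idx, unsigned elt)
{
  return (lone_idx == 1) ? (elt + 4) : elt;
}

/* The same expression without them; operator precedence already groups
   it as (lone_idx == 1) ? (elt + 4) : elt.  */
static unsigned
select_without_parens (unsigned lone_idx, unsigned elt)
{
  return lone_idx == 1 ? elt + 4 : elt;
}
```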

> +      /* shufps.  */
> +      ok = expand_vselect_vconcat(d->target, tmp, d->op1,
> +                               perm1, d->nelt, false);

Again, missing space before (.

Otherwise LGTM

        Jakub
