On Mon, Oct 30, 2023 at 1:23 PM <pan2...@intel.com> wrote:
>
> From: Pan Li <pan2...@intel.com>
>
> Update in v3:
>
> * Add a function to check whether the type size is legal for the call vectorizer.
>
> Update in v2:
>
> * Fix one ICE of type assertion.
> * Adjust some test cases for aarch64 sve and riscv vector.
>
> Original log:
>
> vectorizable_call places a restriction on the data type size:
> DF to DI is allowed but SF to DI isn't. You may see the message below
> when trying to vectorize a function call like lrintf.
>
> void
> test_lrintf (long *out, float *in, unsigned count)
> {
>   for (unsigned i = 0; i < count; i++)
>     out[i] = __builtin_lrintf (in[i]);
> }
>
> lrintf.c:5:26: missed: couldn't vectorize loop
> lrintf.c:5:26: missed: not vectorized: unsupported data-type
>
> As a result, a standard name pattern like lrintmn2 cannot work when the
> data type sizes differ, as in SF => DI. This patch refines the data type
> size check and unblocks standard names like lrintmn2 under certain conditions.
>
> The type size of vectype_out needs to be exactly the same as the type
> size of vectype_in when vectype_out isn't participating in the optab
> selection. There is no such restriction when vectype_out is part of
> the optab query.
>
> The following tests pass with this patch.
>
> * The x86 bootstrap and regression tests.
> * The aarch64 regression tests.
> * The risc-v regression tests.
> * The lrintf standard name works on risc-v.
>
> gcc/ChangeLog:
>
>         * tree-vect-stmts.cc (vectorizable_type_size_legal_p): New
>         function to check whether the type size is legal.
>         (vectorizable_call): Use vectorizable_type_size_legal_p.
>
> Signed-off-by: Pan Li <pan2...@intel.com>
> ---
>  gcc/tree-vect-stmts.cc | 51 +++++++++++++++++++++++++++++++-----------
>  1 file changed, 38 insertions(+), 13 deletions(-)
>
> diff --git a/gcc/tree-vect-stmts.cc b/gcc/tree-vect-stmts.cc
> index a9200767f67..24b3448d961 100644
> --- a/gcc/tree-vect-stmts.cc
> +++ b/gcc/tree-vect-stmts.cc
> @@ -1430,6 +1430,35 @@ vectorizable_internal_function (combined_fn cfn, tree fndecl,
>    return IFN_LAST;
>  }
>
> +/* Return TRUE when the type size is legal for the call vectorizer,
> +   or FALSE otherwise.
> +   The type sizes of vectype_in and vectype_out must be exactly
> +   the same when vectype_out isn't participating in the optab query.
> +   There is no restriction on the type size when vectype_out
> +   is part of the optab query.
> + */
> +static bool
> +vectorizable_type_size_legal_p (internal_fn ifn, tree vectype_out,
> +                               tree vectype_in)
> +{
> +  bool same_size_p = TYPE_SIZE (vectype_in) == TYPE_SIZE (vectype_out);
> +
> +  if (ifn == IFN_LAST || !direct_internal_fn_p (ifn))
> +    return same_size_p;
> +
> +  const direct_internal_fn_info &difn_info = direct_internal_fn (ifn);
> +
> +  if (!difn_info.vectorizable)
> +    return same_size_p;
> +
> +  /* According to vectorizable_internal_function, type0/1 < 0 indicates
> +     that vectype_out participates in the optab selection, so the type size
> +     check can be skipped here.  */
> +  if (difn_info.type0 < 0 || difn_info.type1 < 0)
> +    return true;

Can you instead amend vectorizable_internal_function to contain the check,
returning IFN_LAST if it doesn't hold?
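
As a rough illustration of that shape, here is a self-contained model, not
GCC code. Every name ending in _model is hypothetical, the type0/type1
values given to the two example functions are assumptions made for the
sketch, and vector types are reduced to a bit size. The only point is that
the lookup itself reports IFN_LAST when the sizes differ and the output
type takes no part in the query, so the caller needs no separate predicate.

```cpp
#include <cassert>

// Hypothetical stand-ins for GCC internals; none of these are real GCC types.
enum internal_fn_model { IFN_LRINT_MODEL, IFN_CTZ_MODEL, IFN_LAST_MODEL };

// A vector type reduced to its total size in bits (e.g. V4SF -> 128).
struct vectype_model { unsigned size_in_bits; };

// Mirrors the fields the patch reads from direct_internal_fn_info.
struct direct_fn_info_model {
  int type0;          // < 0 models "the output type takes part in the query"
  int type1;
  bool vectorizable;
};

// Assumed table for the sketch: the conversion-like function keys on the
// output type, the other one only on its argument type.
static direct_fn_info_model info_for_model (internal_fn_model ifn)
{
  switch (ifn)
    {
    case IFN_LRINT_MODEL: return { -1, 0, true };
    case IFN_CTZ_MODEL:   return { 0, 0, true };
    default:              return { 0, 0, false };
    }
}

// Lookup with the size check folded in: returns IFN_LAST_MODEL when the
// sizes differ and the output type is not part of the query.
static internal_fn_model
vectorizable_internal_function_model (internal_fn_model ifn,
                                      vectype_model vectype_out,
                                      vectype_model vectype_in)
{
  if (ifn == IFN_LAST_MODEL)
    return IFN_LAST_MODEL;

  const direct_fn_info_model info = info_for_model (ifn);
  if (!info.vectorizable)
    return IFN_LAST_MODEL;

  const bool out_in_query = info.type0 < 0 || info.type1 < 0;
  const bool same_size_p
    = vectype_in.size_in_bits == vectype_out.size_in_bits;
  if (!out_in_query && !same_size_p)
    return IFN_LAST_MODEL;

  return ifn;
}
```

In this model, an SF => DI style call (128-bit input, 256-bit output) keeps
its function because the output type is part of the query, while a
size-mismatched call to the other function comes back as IFN_LAST_MODEL.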

> +
> +  return same_size_p;
> +}
>
>  static tree permute_vec_elements (vec_info *, tree, tree, tree, stmt_vec_info,
>                                   gimple_stmt_iterator *);
> @@ -3361,19 +3390,6 @@ vectorizable_call (vec_info *vinfo,
>
>        return false;
>      }
> -  /* FORNOW: we don't yet support mixtures of vector sizes for calls,
> -     just mixtures of nunits.  E.g. DI->SI versions of __builtin_ctz*
> -     are traditionally vectorized as two VnDI->VnDI IFN_CTZs followed
> -     by a pack of the two vectors into an SI vector.  We would need
> -     separate code to handle direct VnDI->VnSI IFN_CTZs.  */
> -  if (TYPE_SIZE (vectype_in) != TYPE_SIZE (vectype_out))
> -    {
> -      if (dump_enabled_p ())
> -       dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
> -                        "mismatched vector sizes %T and %T\n",
> -                        vectype_in, vectype_out);
> -      return false;
> -    }
>
>    if (VECTOR_BOOLEAN_TYPE_P (vectype_out)
>        != VECTOR_BOOLEAN_TYPE_P (vectype_in))
> @@ -3431,6 +3447,15 @@ vectorizable_call (vec_info *vinfo,
>      ifn = vectorizable_internal_function (cfn, callee, vectype_out,
>                                           vectype_in);
>
> +  if (!vectorizable_type_size_legal_p (ifn, vectype_out, vectype_in))
> +    {
> +      if (dump_enabled_p ())
> +       dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
> +                        "mismatched vector sizes %T and %T\n",
> +                        vectype_in, vectype_out);
> +      return false;
> +    }
> +
>    /* If that fails, try asking for a target-specific built-in function.  */
>    if (ifn == IFN_LAST)
>      {
> --
> 2.34.1
>
