On Tue, Aug 3, 2021 at 2:10 PM Richard Sandiford via Gcc-patches
<gcc-patches@gcc.gnu.org> wrote:
>
> The issue-based vector costs currently assume that a multiply-add
> sequence can be implemented using a single instruction.  This is
> generally true for scalars (which have a 4-operand instruction)
> and SVE (which allows the output to be tied to any input).
> However, for Advanced SIMD, multiplying two values and adding
> an invariant will end up being a move and an MLA.
>
> The only target to use the issue-based vector costs is Neoverse V1,
> which would generally prefer SVE in this case anyway.  I therefore
> don't have a self-contained testcase.  However, the distinction
> becomes more important with a later patch.

But we do cost any invariants separately (for the prologue), so they
should be available in a register.  Why doesn't that work?
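
For reference, the kind of loop under discussion can be sketched as follows.
This is only an illustration: the function and parameter names are
hypothetical and do not come from the patch or from the GCC testsuite.
The addend k is loop-invariant, so it is the accumulator input of every
multiply-add in the loop.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical example, not taken from the patch.  K is invariant, so
   a vectorized loop would keep a broadcast copy of it in a register.
   Advanced SIMD MLA computes Vd += Vn * Vm and so overwrites its
   accumulator; using the invariant as the accumulator therefore means
   copying it to a fresh register first.  Scalar MADD takes a separate
   destination, and SVE can tie the result to one of the multiplicands
   instead.  */
void
mul_add_invariant (int32_t *restrict dst, const int32_t *restrict b,
                   const int32_t *restrict c, int32_t k, size_t n)
{
  for (size_t i = 0; i < n; ++i)
    dst[i] = b[i] * c[i] + k;
}
```

The point of the sketch is that the invariant being available in a register
and the invariant being usable as a destructive accumulator are two separate
questions.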

> gcc/
>         * config/aarch64/aarch64.c (aarch64_multiply_add_p): Add a vec_flags
>         parameter.  Detect cases in which an Advanced SIMD MLA would almost
>         certainly require a MOV.
>         (aarch64_count_ops): Update accordingly.
> ---
>  gcc/config/aarch64/aarch64.c | 25 ++++++++++++++++++++++---
>  1 file changed, 22 insertions(+), 3 deletions(-)
>
> diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
> index 084f8caa0da..19045ef6944 100644
> --- a/gcc/config/aarch64/aarch64.c
> +++ b/gcc/config/aarch64/aarch64.c
> @@ -14767,9 +14767,12 @@ aarch64_integer_truncation_p (stmt_vec_info stmt_info)
>
>  /* Return true if STMT_INFO is the second part of a two-statement multiply-add
>     or multiply-subtract sequence that might be suitable for fusing into a
> -   single instruction.  */
> +   single instruction.  If VEC_FLAGS is zero, analyze the operation as
> +   a scalar one, otherwise analyze it as an operation on vectors with those
> +   VEC_* flags.  */
>  static bool
> -aarch64_multiply_add_p (vec_info *vinfo, stmt_vec_info stmt_info)
> +aarch64_multiply_add_p (vec_info *vinfo, stmt_vec_info stmt_info,
> +                       unsigned int vec_flags)
>  {
>    gassign *assign = dyn_cast<gassign *> (stmt_info->stmt);
>    if (!assign)
> @@ -14797,6 +14800,22 @@ aarch64_multiply_add_p (vec_info *vinfo, stmt_vec_info stmt_info)
>        if (!rhs_assign || gimple_assign_rhs_code (rhs_assign) != MULT_EXPR)
>         continue;
>
> +      if (vec_flags & VEC_ADVSIMD)
> +       {
> +         /* Scalar and SVE code can tie the result to any FMLA input (or none,
> +            although that requires a MOVPRFX for SVE).  However, Advanced SIMD
> +            only supports MLA forms, so will require a move if the result
> +            cannot be tied to the accumulator.  The most important case in
> +            which this is true is when the accumulator input is invariant.  */
> +         rhs = gimple_op (assign, 3 - i);
> +         if (TREE_CODE (rhs) != SSA_NAME)
> +           return false;
> +         def_stmt_info = vinfo->lookup_def (rhs);
> +         if (!def_stmt_info
> +             || STMT_VINFO_DEF_TYPE (def_stmt_info) == vect_external_def)
> +           return false;
> +       }
> +
>        return true;
>      }
>    return false;
> @@ -15232,7 +15251,7 @@ aarch64_count_ops (class vec_info *vinfo, aarch64_vector_costs *costs,
>      }
>
>    /* Assume that multiply-adds will become a single operation.  */
> -  if (stmt_info && aarch64_multiply_add_p (vinfo, stmt_info))
> +  if (stmt_info && aarch64_multiply_add_p (vinfo, stmt_info, vec_flags))
>      return;
>
>    /* When costing scalar statements in vector code, the count already
