On Tue, Nov 5, 2019 at 3:28 PM Richard Sandiford
<richard.sandif...@arm.com> wrote:
>
> vect_analyze_loop_costing uses two profitability thresholds: a runtime
> one and a static compile-time one.  The runtime one is simply the point
> at which the vector loop is cheaper than the scalar loop, while the
> static one also takes into account the cost of choosing between the
> scalar and vector loops at runtime.  We compare this static cost against
> the expected execution frequency to decide whether it's worth generating
> any vector code at all.
>
> However, we never reclaimed the cost of applying the runtime threshold
> if it turned out that the vector code can always be used.  And we only
> know whether that's true once we've calculated what the runtime
> threshold would be.

OK.
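
For readers following the thread, here is a minimal sketch of the decision described above. It is an illustration under stated assumptions, not the GCC implementation: struct loop_cost_info and both function names are hypothetical stand-ins for the loop_vec_info accessors the patch uses, and all values are plain integers rather than GCC's poly-int types.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified stand-in for the loop_vec_info fields that the
   patch consults.  This is not GCC's real data structure.  */
struct loop_cost_info
{
  bool niters_known;          /* models LOOP_VINFO_NITERS_KNOWN_P */
  unsigned int threshold;     /* models LOOP_VINFO_COST_MODEL_THRESHOLD */
  unsigned int vf;            /* models vect_vf_for_cost */
  bool requires_versioning;   /* models LOOP_REQUIRES_VERSIONING */
  bool peeling_for_niter;     /* models LOOP_VINFO_PEELING_FOR_NITER */
  bool peeling_for_alignment; /* models LOOP_VINFO_PEELING_FOR_ALIGNMENT */
};

/* Mirror of the new predicate: a runtime profitability check is needed
   only when the iteration count is unknown at compile time and the
   threshold is at least the vectorization factor.  */
static bool
apply_runtime_profitability_check_p (const struct loop_cost_info *info)
{
  assert (info->vf > 0);
  return !info->niters_known && info->threshold >= info->vf;
}

/* Return the threshold to compare against the expected iteration count.
   If nothing forces a runtime choice between the scalar and vector loops,
   drop the cost of that choice and fall back to the runtime threshold.  */
static unsigned int
static_threshold (const struct loop_cost_info *info,
                  unsigned int min_profitable_iters,
                  unsigned int min_profitable_estimate)
{
  if (min_profitable_estimate > min_profitable_iters
      && !info->requires_versioning
      && !info->peeling_for_niter
      && !info->peeling_for_alignment
      && !apply_runtime_profitability_check_p (info))
    return min_profitable_iters;
  return min_profitable_estimate;
}
```

In this sketch, a loop with a threshold of 3 and a vectorization factor of 4 needs no runtime check, so static_threshold returns the smaller min_profitable_iters. The same loop with a threshold of 8 and an unknown iteration count keeps the larger estimate.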

>
> 2019-11-04  Richard Sandiford  <richard.sandif...@arm.com>
>
> gcc/
>         * tree-vectorizer.h (vect_apply_runtime_profitability_check_p):
>         New function.
>         * tree-vect-loop-manip.c (vect_loop_versioning): Use it.
>         * tree-vect-loop.c (vect_analyze_loop_2): Likewise.
>         (vect_transform_loop): Likewise.
>         (vect_analyze_loop_costing): Don't take the cost of versioning
>         into account for the static profitability threshold if it turns
>         out that no versioning is needed.
>
> Index: gcc/tree-vectorizer.h
> ===================================================================
> --- gcc/tree-vectorizer.h       2019-11-05 11:14:42.786884473 +0000
> +++ gcc/tree-vectorizer.h       2019-11-05 14:19:33.829371745 +0000
> @@ -1557,6 +1557,17 @@ vect_get_scalar_dr_size (dr_vec_info *dr
>    return tree_to_uhwi (TYPE_SIZE_UNIT (TREE_TYPE (DR_REF (dr_info->dr))));
>  }
>
> +/* Return true if LOOP_VINFO requires a runtime check for whether the
> +   vector loop is profitable.  */
> +
> +inline bool
> +vect_apply_runtime_profitability_check_p (loop_vec_info loop_vinfo)
> +{
> +  unsigned int th = LOOP_VINFO_COST_MODEL_THRESHOLD (loop_vinfo);
> +  return (!LOOP_VINFO_NITERS_KNOWN_P (loop_vinfo)
> +         && th >= vect_vf_for_cost (loop_vinfo));
> +}
> +
>  /* Source location + hotness information. */
>  extern dump_user_location_t vect_location;
>
> Index: gcc/tree-vect-loop-manip.c
> ===================================================================
> --- gcc/tree-vect-loop-manip.c  2019-11-05 10:38:31.838181047 +0000
> +++ gcc/tree-vect-loop-manip.c  2019-11-05 14:19:33.825371773 +0000
> @@ -3173,8 +3173,7 @@ vect_loop_versioning (loop_vec_info loop
>      = LOOP_REQUIRES_VERSIONING_FOR_SIMD_IF_COND (loop_vinfo);
>    unsigned th = LOOP_VINFO_COST_MODEL_THRESHOLD (loop_vinfo);
>
> -  if (th >= vect_vf_for_cost (loop_vinfo)
> -      && !LOOP_VINFO_NITERS_KNOWN_P (loop_vinfo)
> +  if (vect_apply_runtime_profitability_check_p (loop_vinfo)
>        && !ordered_p (th, versioning_threshold))
>      cond_expr = fold_build2 (GE_EXPR, boolean_type_node, scalar_loop_iters,
>                              build_int_cst (TREE_TYPE (scalar_loop_iters),
> Index: gcc/tree-vect-loop.c
> ===================================================================
> --- gcc/tree-vect-loop.c        2019-11-05 11:14:42.782884501 +0000
> +++ gcc/tree-vect-loop.c        2019-11-05 14:19:33.829371745 +0000
> @@ -1689,6 +1689,24 @@ vect_analyze_loop_costing (loop_vec_info
>        return 0;
>      }
>
> +  /* The static profitablity threshold min_profitable_estimate includes
> +     the cost of having to check at runtime whether the scalar loop
> +     should be used instead.  If it turns out that we don't need or want
> +     such a check, the threshold we should use for the static estimate
> +     is simply the point at which the vector loop becomes more profitable
> +     than the scalar loop.  */
> +  if (min_profitable_estimate > min_profitable_iters
> +      && !LOOP_REQUIRES_VERSIONING (loop_vinfo)
> +      && !LOOP_VINFO_PEELING_FOR_NITER (loop_vinfo)
> +      && !LOOP_VINFO_PEELING_FOR_ALIGNMENT (loop_vinfo)
> +      && !vect_apply_runtime_profitability_check_p (loop_vinfo))
> +    {
> +      if (dump_enabled_p ())
> +       dump_printf_loc (MSG_NOTE, vect_location, "no need for a runtime"
> +                        " choice between the scalar and vector loops\n");
> +      min_profitable_estimate = min_profitable_iters;
> +    }
> +
>    HOST_WIDE_INT estimated_niter;
>
>    /* If we are vectorizing an epilogue then we know the maximum number of
> @@ -2225,8 +2243,7 @@ vect_analyze_loop_2 (loop_vec_info loop_
>
>        /*  Use the same condition as vect_transform_loop to decide when to use
>           the cost to determine a versioning threshold.  */
> -      if (th >= vect_vf_for_cost (loop_vinfo)
> -         && !LOOP_VINFO_NITERS_KNOWN_P (loop_vinfo)
> +      if (vect_apply_runtime_profitability_check_p (loop_vinfo)
>           && ordered_p (th, niters_th))
>         niters_th = ordered_max (poly_uint64 (th), niters_th);
>
> @@ -8268,14 +8285,13 @@ vect_transform_loop (loop_vec_info loop_
>       run at least the (estimated) vectorization factor number of times
>       checking is pointless, too.  */
>    th = LOOP_VINFO_COST_MODEL_THRESHOLD (loop_vinfo);
> -  if (th >= vect_vf_for_cost (loop_vinfo)
> -      && !LOOP_VINFO_NITERS_KNOWN_P (loop_vinfo))
> +  if (vect_apply_runtime_profitability_check_p (loop_vinfo))
>      {
> -       if (dump_enabled_p ())
> -         dump_printf_loc (MSG_NOTE, vect_location,
> -                          "Profitability threshold is %d loop iterations.\n",
> -                          th);
> -       check_profitability = true;
> +      if (dump_enabled_p ())
> +       dump_printf_loc (MSG_NOTE, vect_location,
> +                        "Profitability threshold is %d loop iterations.\n",
> +                        th);
> +      check_profitability = true;
>      }
>
>    /* Make sure there exists a single-predecessor exit bb.  Do this before
