On Wed, Oct 9, 2024 at 3:27 AM liuhongt <hongtao....@intel.com> wrote:
>
> >We'd also need to update the documentation:
>
> >... The @samp{very-cheap} model only
> >allows vectorization if the vector code would entirely replace the
> >scalar code that is being vectorized.  For example, if each iteration
> >of a vectorized loop would only be able to handle exactly four iterations
> >of the scalar loop, the @samp{very-cheap} model would only allow
> >vectorization if the scalar iteration count is known to be a multiple
> >of four.
> Changed.
>
> >And since it's a change in documented behaviour, it should probably
> >be in the release notes too.
>
> Will submit another patch for that when it lands on trunk.
>
> Bootstrapped and regtested on x86_64-pc-linux-gnu{-m32,}, 
> aarch64-unknown-linux-gnu{-m32,}.
>
> Ok for trunk?

OK.

Richard.

> gcc/ChangeLog:
>
>         * tree-vect-loop.cc (vect_analyze_loop_costing): Enable
>         vectorization for LOOP_VINFO_PEELING_FOR_NITER in very cheap
>         cost model.
>         (vect_analyze_loop): Disable epilogue vectorization in very
>         cheap cost model.
>         * doc/invoke.texi: Adjust documents for very-cheap cost model.
> ---
>  gcc/doc/invoke.texi   | 11 ++++-------
>  gcc/tree-vect-loop.cc |  6 +++---
>  2 files changed, 7 insertions(+), 10 deletions(-)
>
> diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
> index b2f16b45eaf..edcadeb108a 100644
> --- a/gcc/doc/invoke.texi
> +++ b/gcc/doc/invoke.texi
> @@ -14309,13 +14309,10 @@ counts that will likely execute faster than when executing the original
>  scalar loop.  The @samp{cheap} model disables vectorization of
>  loops where doing so would be cost prohibitive for example due to
>  required runtime checks for data dependence or alignment but otherwise
> -is equal to the @samp{dynamic} model.  The @samp{very-cheap} model only
> -allows vectorization if the vector code would entirely replace the
> -scalar code that is being vectorized.  For example, if each iteration
> -of a vectorized loop would only be able to handle exactly four iterations
> -of the scalar loop, the @samp{very-cheap} model would only allow
> -vectorization if the scalar iteration count is known to be a multiple
> -of four.
> +is equal to the @samp{dynamic} model.  The @samp{very-cheap} model disables
> +vectorization of loops when any runtime check for data dependence or alignment
> +is required, it also disables vectorization of epilogue loops but otherwise is
> +equal to the @samp{cheap} model.
>
>  The default cost model depends on other optimization flags and is
>  either @samp{dynamic} or @samp{cheap}.
> diff --git a/gcc/tree-vect-loop.cc b/gcc/tree-vect-loop.cc
> index 6933f597b4d..a76d3b8ea5f 100644
> --- a/gcc/tree-vect-loop.cc
> +++ b/gcc/tree-vect-loop.cc
> @@ -2375,8 +2375,7 @@ vect_analyze_loop_costing (loop_vec_info loop_vinfo,
>       a copy of the scalar code (even if we might be able to vectorize it).  */
>    if (loop_cost_model (loop) == VECT_COST_MODEL_VERY_CHEAP
>        && (LOOP_VINFO_PEELING_FOR_ALIGNMENT (loop_vinfo)
> -         || LOOP_VINFO_PEELING_FOR_GAPS (loop_vinfo)
> -         || LOOP_VINFO_PEELING_FOR_NITER (loop_vinfo)))
> +         || LOOP_VINFO_PEELING_FOR_GAPS (loop_vinfo)))
>      {
>        if (dump_enabled_p ())
>         dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
> @@ -3681,7 +3680,8 @@ vect_analyze_loop (class loop *loop, gimple *loop_vectorized_call,
>                            /* No code motion support for multiple epilogues so for now
>                               not supported when multiple exits.  */
>                          && !LOOP_VINFO_EARLY_BREAKS (first_loop_vinfo)
> -                        && !loop->simduid);
> +                        && !loop->simduid
> +                        && loop_cost_model (loop) > VECT_COST_MODEL_VERY_CHEAP);
>    if (!vect_epilogues)
>      return first_loop_vinfo;
>
> --
> 2.31.1
>
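
For illustration only, here is a minimal sketch of the kind of loops the revised documentation text is about. The function names are hypothetical and not taken from the patch or the testsuite. In both functions the trip count is only known at run time, so a vector loop that handles several elements per iteration needs a scalar epilogue for the leftover iterations (peeling for niters). Assuming the target needs no alignment peeling, the first loop is the case the patch is meant to allow under very-cheap, since `restrict` removes the need for a runtime alias check. The second loop would presumably still be rejected, because vectorizing it requires a runtime data-dependence check.

```c
#include <stddef.h>

/* Hypothetical example, not part of the patch.  n is unknown at compile
   time, so a vector loop needs a scalar epilogue for the remaining
   iterations.  The restrict qualifiers promise that dst and src do not
   overlap, so no runtime alias check is needed.  */
void
add_restrict (int *restrict dst, const int *restrict src, size_t n)
{
  for (size_t i = 0; i < n; i++)
    dst[i] += src[i];
}

/* Same loop without restrict.  dst and src may overlap, so vectorizing
   it would need a runtime data-dependence check.  */
void
add_may_alias (int *dst, const int *src, size_t n)
{
  for (size_t i = 0; i < n; i++)
    dst[i] += src[i];
}
```

Compiling with something like `gcc -O2 -ftree-vectorize -fvect-cost-model=very-cheap -fopt-info-vec -fopt-info-vec-missed` should show which of the two loops gets vectorized. The actual result depends on the target and on the cost comparison the vectorizer still performs.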
