On Wed, Oct 9, 2024 at 3:27 AM liuhongt <hongtao....@intel.com> wrote:
>
> >We'd also need to update the documentation:
>
> >... The @samp{very-cheap} model only
> >allows vectorization if the vector code would entirely replace the
> >scalar code that is being vectorized. For example, if each iteration
> >of a vectorized loop would only be able to handle exactly four iterations
> >of the scalar loop, the @samp{very-cheap} model would only allow
> >vectorization if the scalar iteration count is known to be a multiple
> >of four.
> Changed.
>
> >And since it's a change in documented behaviour, it should probably
> >be in the release notes too.
>
> Will submit another patch for that when it lands on trunk.
>
> Bootstrapped and regtested on x86_64-pc-linux-gnu{-m32,},
> aarch64-unknown-linux-gnu{-m32,}.
>
> Ok for trunk?

OK.

Richard.

> gcc/ChangeLog:
>
>         * tree-vect-loop.cc (vect_analyze_loop_costing): Enable
>         vectorization for LOOP_VINFO_PEELING_FOR_NITER in very cheap
>         cost model.
>         (vect_analyze_loop): Disable epilogue vectorization in very
>         cheap cost model.
>         * doc/invoke.texi: Adjust documents for very-cheap cost model.
> ---
>  gcc/doc/invoke.texi   | 11 ++++-------
>  gcc/tree-vect-loop.cc |  6 +++---
>  2 files changed, 7 insertions(+), 10 deletions(-)
>
> diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
> index b2f16b45eaf..edcadeb108a 100644
> --- a/gcc/doc/invoke.texi
> +++ b/gcc/doc/invoke.texi
> @@ -14309,13 +14309,10 @@ counts that will likely execute faster than when executing the original
>  scalar loop. The @samp{cheap} model disables vectorization of
>  loops where doing so would be cost prohibitive for example due to
>  required runtime checks for data dependence or alignment but otherwise
> -is equal to the @samp{dynamic} model. The @samp{very-cheap} model only
> -allows vectorization if the vector code would entirely replace the
> -scalar code that is being vectorized. For example, if each iteration
> -of a vectorized loop would only be able to handle exactly four iterations
> -of the scalar loop, the @samp{very-cheap} model would only allow
> -vectorization if the scalar iteration count is known to be a multiple
> -of four.
> +is equal to the @samp{dynamic} model. The @samp{very-cheap} model disables
> +vectorization of loops when any runtime check for data dependence or alignment
> +is required, it also disables vectorization of epilogue loops but otherwise is
> +equal to the @samp{cheap} model.
>
>  The default cost model depends on other optimization flags and is
>  either @samp{dynamic} or @samp{cheap}.
> diff --git a/gcc/tree-vect-loop.cc b/gcc/tree-vect-loop.cc
> index 6933f597b4d..a76d3b8ea5f 100644
> --- a/gcc/tree-vect-loop.cc
> +++ b/gcc/tree-vect-loop.cc
> @@ -2375,8 +2375,7 @@ vect_analyze_loop_costing (loop_vec_info loop_vinfo,
>       a copy of the scalar code (even if we might be able to vectorize it). */
>    if (loop_cost_model (loop) == VECT_COST_MODEL_VERY_CHEAP
>        && (LOOP_VINFO_PEELING_FOR_ALIGNMENT (loop_vinfo)
> -          || LOOP_VINFO_PEELING_FOR_GAPS (loop_vinfo)
> -          || LOOP_VINFO_PEELING_FOR_NITER (loop_vinfo)))
> +          || LOOP_VINFO_PEELING_FOR_GAPS (loop_vinfo)))
>      {
>        if (dump_enabled_p ())
>          dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
> @@ -3681,7 +3680,8 @@ vect_analyze_loop (class loop *loop, gimple *loop_vectorized_call,
>                           /* No code motion support for multiple epilogues so for now
>                              not supported when multiple exits. */
>                           && !LOOP_VINFO_EARLY_BREAKS (first_loop_vinfo)
> -                         && !loop->simduid);
> +                         && !loop->simduid
> +                         && loop_cost_model (loop) > VECT_COST_MODEL_VERY_CHEAP);
>    if (!vect_epilogues)
>      return first_loop_vinfo;
>
> --
> 2.31.1
>
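
For illustration only (this is not part of the patch or its testsuite), here is a minimal sketch of the kind of loop the changed check in vect_analyze_loop_costing is about. The function name add_arrays is hypothetical. The trip count n is not known to be a multiple of the vector width, so vectorizing it requires peeling for niter, i.e. a scalar loop for the leftover iterations. The restrict qualifiers are there on the assumption that they let the compiler avoid a runtime data dependence check. Whether the loop is actually vectorized still depends on the target and on the rest of the cost model.

```c
#include <stddef.h>

/* Element-wise sum of two arrays.
   n is a runtime value, so the vectorizer cannot assume it is a
   multiple of the vectorization factor; the remaining iterations
   have to be handled by a scalar epilogue.  With this patch the
   very-cheap model no longer rejects the loop for that reason
   alone, and the epilogue itself is left unvectorized.  */
void
add_arrays (int *restrict dst, const int *restrict a,
            const int *restrict b, size_t n)
{
  for (size_t i = 0; i < n; i++)
    dst[i] = a[i] + b[i];
}
```

One way to see what the vectorizer decides is to compile with something like -O2 -ftree-vectorize -fvect-cost-model=very-cheap -fopt-info-vec-missed and compare the output with and without the patch applied.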