On Wed, Oct 30, 2024 at 8:47 AM Kugan Vivekanandarajah <kvivekana...@nvidia.com> wrote: > > Hi Richard, > > > On 29 Oct 2024, at 8:33 pm, Richard Biener <richard.guent...@gmail.com> > > wrote: > > > > External email: Use caution opening links or attachments > > > > > > On Tue, Oct 29, 2024 at 9:24 AM Kugan Vivekanandarajah > > <kvivekana...@nvidia.com> wrote: > >> > >> Hi Richard, > >> Thanks for the review. > >> > >>> On 28 Oct 2024, at 9:18 pm, Richard Biener <richard.guent...@gmail.com> > >>> wrote: > >>> > >>> External email: Use caution opening links or attachments > >>> > >>> > >>> On Mon, Oct 28, 2024 at 9:35 AM Kugan Vivekanandarajah > >>> <kvivekana...@nvidia.com> wrote: > >>>> > >>>> Hi, > >>>> > >>>> When ifcvt version a loop, it sets dont_vectorize to the scalar loop. If > >>>> the > >>>> vector loop is not vectorized and removed, the scalar loop is still left > >>>> with > >>>> dont_vectorize. As a result, BB vectorization will not happen. > >>>> > >>>> This patch adds a new attribute called dont_loop_vectorize (that is > >>>> different > >>>> from general dont_vectorize) specifically for loops versioned. BB > >>>> vectorization > >>>> does not need to honour this and still can vectorize. > >>>> > >>>> Bootstrapped and regression tested on aarch64-linux-gnu with no new > >>>> regressions. > >>>> > >>>> Is this OK? > >>> > >>> I believe if-conversion never versions a loop that has > >>> ->dont_vectorize set so when > >>> the vectorizer elides the .LOOP_VECTORIZED test it can simply clear > >>> the flag again. > >>> > >>> I don't like adding a new flag, instead if the above doesn't work, the > >>> vectorizer > >>> should be changed how it identifies loop candidates, not relying on this > >>> flag. > >> > >> Here is a version where I am resetting dont_vectorize while folding > >> ffold_loop_internal_call with false. > >> > >> Bootstrapped and regression tested on aarch64-linux-gnu. Is this OK? > > > > Hmm, I'm not sure this way is super reliable given taken_edge->dest > > could be a preheader > > block. I see we have many calls to fold_loop_internal_call - one is > > even suspicious to break > > with your patch: > > > > /* If we are going to vectorize outer loop, prevent vectorization > > of the inner loop in the scalar loop - either the scalar loop is > > thrown away, so it is a wasted work, or is used only for > > a few iterations. */ > > if (scalar_loop->inner) > > { > > gimple *g = vect_loop_vectorized_call (scalar_loop->inner); > > if (g) > > { > > arg = gimple_call_arg (g, 0); > > get_loop (fun, tree_to_shwi (arg))->dont_vectorize = true; > > fold_loop_internal_call (g, boolean_false_node); > > > > does it work if you just do > > > > diff --git a/gcc/tree-vectorizer.cc b/gcc/tree-vectorizer.cc > > index af112f212fe..16fa0ec1bb7 100644 > > --- a/gcc/tree-vectorizer.cc > > +++ b/gcc/tree-vectorizer.cc > > @@ -1326,6 +1326,7 @@ pass_vectorize::execute (function *fun) > > if (g) > > { > > fold_loop_internal_call (g, boolean_false_node); > > + loop->dont_vectorize = false; > > ret |= TODO_cleanup_cfg; > > g = NULL; > > } > > @@ -1335,6 +1336,7 @@ pass_vectorize::execute (function *fun) > > if (g) > > { > > fold_loop_internal_call (g, boolean_false_node); > > + loop->dont_vectorize = false; > > ret |= TODO_cleanup_cfg; > > } > > } > > > > ? > > Yes, this works. Bootstrapped and regression tested on aarch64-linux-gnu. Is > this OK?
OK. Thanks, Richard. > Thanks, > Kugan > > > > > > >> Thanks, > >> Kugan > >> > >> > >>> > >>> Richard. > >>> > >>>> Thanks, > >>>> Kugan > >