Hi Richard, > On 29 Oct 2024, at 8:33 pm, Richard Biener <richard.guent...@gmail.com> wrote: > > External email: Use caution opening links or attachments > > > On Tue, Oct 29, 2024 at 9:24 AM Kugan Vivekanandarajah > <kvivekana...@nvidia.com> wrote: >> >> Hi Richard, >> Thanks for the review. >> >>> On 28 Oct 2024, at 9:18 pm, Richard Biener <richard.guent...@gmail.com> >>> wrote: >>> >>> External email: Use caution opening links or attachments >>> >>> >>> On Mon, Oct 28, 2024 at 9:35 AM Kugan Vivekanandarajah >>> <kvivekana...@nvidia.com> wrote: >>>> >>>> Hi, >>>> >>>> When ifcvt version a loop, it sets dont_vectorize to the scalar loop. If >>>> the >>>> vector loop is not vectorized and removed, the scalar loop is still left >>>> with >>>> dont_vectorize. As a result, BB vectorization will not happen. >>>> >>>> This patch adds a new attribute called dont_loop_vectorize (that is >>>> different >>>> from general dont_vectorize) specifically for loops versioned. BB >>>> vectorization >>>> does not need to honour this and still can vectorize. >>>> >>>> Bootstrapped and regression tested on aarch64-linux-gnu with no new >>>> regressions. >>>> >>>> Is this OK? >>> >>> I believe if-conversion never versions a loop that has >>> ->dont_vectorize set so when >>> the vectorizer elides the .LOOP_VECTORIZED test it can simply clear >>> the flag again. >>> >>> I don't like adding a new flag, instead if the above doesn't work, the >>> vectorizer >>> should be changed how it identifies loop candidates, not relying on this >>> flag. >> >> Here is a version where I am resetting dont_vectorize while folding >> ffold_loop_internal_call with false. >> >> Bootstrapped and regression tested on aarch64-linux-gnu. Is this OK? > > Hmm, I'm not sure this way is super reliable given taken_edge->dest > could be a preheader > block. I see we have many calls to fold_loop_internal_call - one is > even suspicious to break > with your patch: > > /* If we are going to vectorize outer loop, prevent vectorization > of the inner loop in the scalar loop - either the scalar loop is > thrown away, so it is a wasted work, or is used only for > a few iterations. */ > if (scalar_loop->inner) > { > gimple *g = vect_loop_vectorized_call (scalar_loop->inner); > if (g) > { > arg = gimple_call_arg (g, 0); > get_loop (fun, tree_to_shwi (arg))->dont_vectorize = true; > fold_loop_internal_call (g, boolean_false_node); > > does it work if you just do > > diff --git a/gcc/tree-vectorizer.cc b/gcc/tree-vectorizer.cc > index af112f212fe..16fa0ec1bb7 100644 > --- a/gcc/tree-vectorizer.cc > +++ b/gcc/tree-vectorizer.cc > @@ -1326,6 +1326,7 @@ pass_vectorize::execute (function *fun) > if (g) > { > fold_loop_internal_call (g, boolean_false_node); > + loop->dont_vectorize = false; > ret |= TODO_cleanup_cfg; > g = NULL; > } > @@ -1335,6 +1336,7 @@ pass_vectorize::execute (function *fun) > if (g) > { > fold_loop_internal_call (g, boolean_false_node); > + loop->dont_vectorize = false; > ret |= TODO_cleanup_cfg; > } > } > > ?
Yes, this works. Bootstrapped and regression tested on aarch64-linux-gnu. Is this OK? Thanks, Kugan > >> Thanks, >> Kugan >> >> >>> >>> Richard. >>> >>>> Thanks, >>>> Kugan
0001-PATCH-Fix-SLP-when-ifcvt-versioned-loop-is-not-vecto_v3.patch
Description: 0001-PATCH-Fix-SLP-when-ifcvt-versioned-loop-is-not-vecto_v3.patch