Hi Richard,

> On 29 Oct 2024, at 8:33 pm, Richard Biener <richard.guent...@gmail.com> wrote:
>
> External email: Use caution opening links or attachments
>
>
> On Tue, Oct 29, 2024 at 9:24 AM Kugan Vivekanandarajah
> <kvivekana...@nvidia.com> wrote:
>>
>> Hi Richard,
>> Thanks for the review.
>>
>>> On 28 Oct 2024, at 9:18 pm, Richard Biener <richard.guent...@gmail.com> 
>>> wrote:
>>>
>>> External email: Use caution opening links or attachments
>>>
>>>
>>> On Mon, Oct 28, 2024 at 9:35 AM Kugan Vivekanandarajah
>>> <kvivekana...@nvidia.com> wrote:
>>>>
>>>> Hi,
>>>>
>>>> When ifcvt version a loop, it sets dont_vectorize to the scalar loop. If 
>>>> the
>>>> vector loop is not vectorized and removed, the scalar loop is still left 
>>>> with
>>>> dont_vectorize. As a result, BB vectorization will not happen.
>>>>
>>>> This patch adds a new attribute called dont_loop_vectorize (that is 
>>>> different
>>>> from general dont_vectorize) specifically for loops versioned. BB 
>>>> vectorization
>>>> does not need to honour this and still can vectorize.
>>>>
>>>> Bootstrapped and regression tested on aarch64-linux-gnu with no new 
>>>> regressions.
>>>>
>>>> Is this OK?
>>>
>>> I believe if-conversion never versions a loop that has
>>> ->dont_vectorize set so when
>>> the vectorizer elides the .LOOP_VECTORIZED test it can simply clear
>>> the flag again.
>>>
>>> I don't like adding a new flag, instead if the above doesn't work, the
>>> vectorizer
>>> should be changed how it identifies loop candidates, not relying on this 
>>> flag.
>>
>> Here is a version where I am resetting dont_vectorize while folding 
>> ffold_loop_internal_call with false.
>>
>> Bootstrapped and regression tested on aarch64-linux-gnu. Is this OK?
>
> Hmm, I'm not sure this way is super reliable given taken_edge->dest
> could be a preheader
> block.  I see we have many calls to fold_loop_internal_call - one is
> even suspicious to break
> with your patch:
>
>  /* If we are going to vectorize outer loop, prevent vectorization
>     of the inner loop in the scalar loop - either the scalar loop is
>     thrown away, so it is a wasted work, or is used only for
>     a few iterations.  */
>  if (scalar_loop->inner)
>    {
>      gimple *g = vect_loop_vectorized_call (scalar_loop->inner);
>      if (g)
>        {
>          arg = gimple_call_arg (g, 0);
>          get_loop (fun, tree_to_shwi (arg))->dont_vectorize = true;
>          fold_loop_internal_call (g, boolean_false_node);
>
> does it work if you just do
>
> diff --git a/gcc/tree-vectorizer.cc b/gcc/tree-vectorizer.cc
> index af112f212fe..16fa0ec1bb7 100644
> --- a/gcc/tree-vectorizer.cc
> +++ b/gcc/tree-vectorizer.cc
> @@ -1326,6 +1326,7 @@ pass_vectorize::execute (function *fun)
>            if (g)
>              {
>                fold_loop_internal_call (g, boolean_false_node);
> +               loop->dont_vectorize = false;
>                ret |= TODO_cleanup_cfg;
>                g = NULL;
>              }
> @@ -1335,6 +1336,7 @@ pass_vectorize::execute (function *fun)
>            if (g)
>              {
>                fold_loop_internal_call (g, boolean_false_node);
> +               loop->dont_vectorize = false;
>                ret |= TODO_cleanup_cfg;
>              }
>          }
>
> ?

Yes, this works. Bootstrapped and regression tested on aarch64-linux-gnu. Is 
this OK?

Thanks,
Kugan



>
>> Thanks,
>> Kugan
>>
>>
>>>
>>> Richard.
>>>
>>>> Thanks,
>>>> Kugan


Attachment: 0001-PATCH-Fix-SLP-when-ifcvt-versioned-loop-is-not-vecto_v3.patch
Description: 0001-PATCH-Fix-SLP-when-ifcvt-versioned-loop-is-not-vecto_v3.patch

Reply via email to