> What was it that made you propose this change?

When we have a loop of say int and a pragma unroll 4

If the vectorizer picks V4SI as the mode, the requested unroll ended up exactly 
matching the VF. As such the requested unroll is 1 and we don't clear the 
pragma.

So it did honor the requested unroll factor. However since we didn't set the 
unroll amount back and left it at 4 the rtl unroller won't use the rtl cost 
model at all and just unroll the vector loop 4 times.

This change isn't to bypass the rtl cost model, it's to allow it to be used 
rather than overriding it after vectorization.

Tamar
________________________________
From: Richard Biener <rguent...@suse.de>
Sent: Tuesday, September 2, 2025 12:10 PM
To: Tamar Christina <tamar.christ...@arm.com>
Cc: gcc-patches@gcc.gnu.org <gcc-patches@gcc.gnu.org>; nd <n...@arm.com>
Subject: Re: [PATCH 1/3]middle-end: clear the user unroll flag if the cost 
model has overriden it

On Tue, 2 Sep 2025, Tamar Christina wrote:

> If the user has requested loop unrolling through pragma GCC unroll then at the
> moment we only set LOOP_VINFO_USER_UNROLL if the vectorizer has not overrode 
> the
> unroll factor (through backend costing) or if the VF made the requested unroll
> factor be 1.
>
> But of these events are costing related, and so it stands to reason that we
> should set LOOP_VINFO_USER_UNROLL to we return the RTL unroller to use the
> backend costing for any further unrolling.

I can't parse this fully.

Hmm, but shouldn't we at least special-case when !new_res (aka
the requested unrolling wasn't used) but VF == 1, aka vectorization
didn't do any unrolling?

Or is the reasoning that the vector cost model is as good as the
RTL one (which doesn't exist...) and thus if that decided not to
unroll do not let the RTL cost model override it?  But that means
to set loop->unroll to 1, not 0 as we do for LOOP_VINFO_USER_UNROLL.
The RTL "cost mode" will simply unroll by 8 if possible.

What was it that made you propose this change?

Richard.

> Bootstrapped Regtested on aarch64-none-linux-gnu,
> arm-none-linux-gnueabihf, x86_64-pc-linux-gnu
> -m32, -m64 and no issues.
>
> Ok for master? Tests using this are in the next patch.
>
> Thanks,
> Tamar
>
> gcc/ChangeLog:
>
>        * tree-vect-loop.cc (vect_analyze_loop_1): If the unroll pragma was set
>        mark it as handled.
>
> ---
> diff --git a/gcc/tree-vect-loop.cc b/gcc/tree-vect-loop.cc
> index 
> 3b56f865ec7c3fdb8e9def37ec446a25086b4e28..f393b45232b3390959a2be756db3aa8b35002bf0
>  100644
> --- a/gcc/tree-vect-loop.cc
> +++ b/gcc/tree-vect-loop.cc
> @@ -2941,11 +2941,13 @@ vect_analyze_loop_1 (class loop *loop, 
> vec_info_shared *shared,
>              {
>                delete loop_vinfo;
>                loop_vinfo = unroll_vinfo;
> -             LOOP_VINFO_USER_UNROLL (loop_vinfo) = user_unroll > 1;
>              }
>            else
>              delete unroll_vinfo;
>          }
> +
> +     /* Record that we have honored a user unroll factor.  */
> +     LOOP_VINFO_USER_UNROLL (loop_vinfo) = user_unroll > 1;
>      }
>
>    /* Remember the autodetected vector mode.  */
>
>
>

--
Richard Biener <rguent...@suse.de>
SUSE Software Solutions Germany GmbH,
Frankenstrasse 146, 90461 Nuernberg, Germany;
GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG Nuernberg)

Reply via email to