[Bug target/120157] No use of SVE early break vectorisation in FP loop

tnfchris at gcc dot gnu.org via Gcc-bugs Wed, 07 May 2025 07:20:54 -0700

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120157


--- Comment #5 from Tamar Christina <tnfchris at gcc dot gnu.org> ---
(In reply to ktkachov from comment #4)
> > Ah indeed, -msve-vector-bits= does do what I expected. Feel free to close
> > this if it's not tracking anything new then.
> 
> Ok. FWIW the original testcase for me had doubles:
> int f11(double *x, double val, int n)
> {
>     int i;
>     for (i = 0; i < n; i++) {
>         if (x[i] == val) break;
>     }
>     return i;
> }
> 
> And with -msve-vector-bits=128 -mcpu=neoverse-v2  --param
> aarch64-autovec-preference=sve-only GCC refuses to vectorise and picks Neon
> without the aarch64-autovec-preference. I do see it vectorising with VLS SVE
> for wider widths, so it may be a V2 cost model thing.
> If choosing Neon is the right thing to do for V2 that's fine, but with
> --param aarch64-autovec-preference=sve-only it should probably use SVE
> rather than refusing to vectorise

Yeah, I don't really know why it did that. I will need to have a look to see
how sve-only is implemented.

It looks like analysis did succeed for at least one SVE mode, but the
vectorizer is correct that SVE would be more expensive here:

/app/example.c:7:19: note:    Runtime profitability threshold = 1
/app/example.c:7:19: note:    Static estimate profitability threshold = 12
/app/example.c:7:19: note:  ***** Analysis succeeded with vector mode VNx2QI
/app/example.c:7:19: note:  Comparing two main loops (VNx2QI at VF 2 vs V2DF at VF 4)
/app/example.c:7:19: note:  ***** Choosing vector mode V2DF

SVE is more expensive due to
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119860

If you change your counter from `int` to `long` to match the width of the
load type (https://godbolt.org/z/dYc49ojM9), you get the code you expected.
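
For reference, a minimal sketch of what that change looks like when applied
to the testcase quoted above. The name f11_long, and widening `n` and the
return type along with the counter, are assumptions made for illustration;
the exact source behind the godbolt link may differ.

```c
/* Sketch only: the quoted testcase with a 64-bit counter, so the induction
   variable has the same width as the double elements being loaded.
   f11_long is a hypothetical name; widening n and the return type to long
   is an assumption, not taken from the linked example.  */
long f11_long(double *x, double val, long n)
{
    long i;
    for (i = 0; i < n; i++) {
        /* Early break: stop at the first element equal to val.  */
        if (x[i] == val) break;
    }
    /* Returns n if no element matched.  */
    return i;
}
```

The loop body is unchanged; only the counter type differs, so any change in
the vector mode chosen comes from the cost comparison rather than from the
early-break logic itself.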

This is a general vectorizer quirk that we'll hopefully address this release
too *knocks on wood*, but I haven't taken a look at Richard's old WIP patches
yet.
