[Bug tree-optimization/120598] Compiler is unable to vectorise scalar code

segher at gcc dot gnu.org via Gcc-bugs Thu, 19 Jun 2025 09:20:00 -0700

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120598


--- Comment #8 from Segher Boessenkool <segher at gcc dot gnu.org> ---
(In reply to Jeevitha from comment #6)
> The following dot_product function gets vectorized with the latest GCC trunk
> and gcc 15.1.0:
> 
> #include <cstdint>
> #include <cstddef>
> extern float dot_product(const int16_t *v1, const int16_t *v2, size_t len);
> float dot_product(const int16_t *v1, const int16_t *v2, size_t len)
> {
>     int64_t d = 0;
>     for (size_t i = 0; i < len; i++)
>         d += int32_t(v1[i]) * int32_t(v2[i]);
>     return static_cast<float>(d);
> }
> 
> 
> I observed that -O2 was used during compilation. However, for GCC versions
> earlier than 15, vectorization of this loop requires -O3. Since they are
> using the -O2 flag, GCC 15 is necessary in this case.

Is that what the original code does?  Or does it convert every number to float
and then sum over that?
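
To make the distinction concrete, here is a rough sketch of the two shapes I
mean.  The original code is not shown in this comment, so the second function
(dot_product_float, a name made up purely for illustration) is only a
hypothetical stand-in for "convert every number to float and then sum".  The
first is the quoted function, renamed dot_product_int so that both can sit in
one file.

```cpp
#include <cstddef>
#include <cstdint>

// Variant A (the function quoted above, renamed): widen to integer,
// accumulate in int64_t, convert to float once at the end.
float dot_product_int(const int16_t *v1, const int16_t *v2, size_t len)
{
    int64_t d = 0;
    for (size_t i = 0; i < len; i++)
        d += int32_t(v1[i]) * int32_t(v2[i]);
    return static_cast<float>(d);
}

// Variant B (hypothetical, not taken from the bug report): convert each
// element to float first and accumulate in float.
float dot_product_float(const int16_t *v1, const int16_t *v2, size_t len)
{
    float d = 0.0f;
    for (size_t i = 0; i < len; i++)
        d += float(v1[i]) * float(v2[i]);
    return d;
}
```

The two are not interchangeable for the vectoriser.  Variant A is an integer
reduction, which can be reordered freely.  Variant B is a floating-point
reduction, and since floating-point addition is not associative, reordering
the sum generally needs something like -ffast-math or -fassociative-math
before it can be vectorised in the usual way.  The two can also give different
results for long inputs because of rounding.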

And can you try to find out which patch in GCC 15 made this work at -O2?  That
would help in case we want to backport anything, and also just to get a better
grip on what is happening here :-)
