[Bug tree-optimization/113552] [11/12/13/14 Regression] vectorizer generates calls to vector math routines with 1 simd lane.

rguenth at gcc dot gnu.org via Gcc-bugs Tue, 23 Jan 2024 02:39:59 -0800

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113552


Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |jakub at gcc dot gnu.org

--- Comment #6 from Richard Biener <rguenth at gcc dot gnu.org> ---
(In reply to Tamar Christina from comment #5)
> __attribute__ ((__simd__ ("notinbranch"), const))
> double cos (double);

So here the backend is then probably responsible for parsing this into a valid
list of simdlen cases.

> void foo (float *a, double *b)
> {
>     for (int i = 0; i < 12; i+=3)
>       {
>         b[i] = cos (5.0 * a[i]);
>         b[i+1] = cos (5.0 * a[i+1]);
>         b[i+2] = cos (5.0 * a[i+2]);
>       }
> }
> 
> Simple C example that shows the problem.
> 
> This seems to happen when SLP succeeds and the group size is a non power
> of two.  The vectorizer then unrolls to make it a power of two and during
> vectorization it seems to destroy the vector, make the call and
> reconstruct it.
> 
> So this seems like an SLP vectorization bug.  I can't seem to trigger it
> however on GCC < 14 since SLP consistently fails for all my examples because
> it tries a mode that's larger than the vector size.
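
For readers unfamiliar with the pattern described in the quote, the sketch below
shows the rough shape of "destroy the vector, make the call and reconstruct it"
using the GCC/Clang vector extension.  It is an illustration only, not the code
the vectorizer actually emits; one_lane_routine is a hypothetical stand-in for
a vector math routine called with a single lane.

```c
/* GCC/Clang vector extension: two doubles in one 16-byte vector (V2DF). */
typedef double v2df __attribute__((vector_size(16)));

/* Hypothetical stand-in for a vector math routine called with one lane. */
static double one_lane_routine(double x)
{
    return 2.0 * x + 1.0;
}

/* Take the vector apart, call the routine once per lane, and rebuild the
   result vector.  A 2-lane routine would need a single call instead. */
v2df call_per_lane(v2df in)
{
    v2df out;
    for (int lane = 0; lane < 2; lane++)
        out[lane] = one_lane_routine(in[lane]);
    return out;
}
```

With a 2-lane input this makes two calls where a single 2-lane call would do,
which is the overhead the report is about.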

On the 13 branch on x86_64 the above results in a large VF and uses
_ZGVbN2v_cos; trunk behaves the same.
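
As a reading aid, a name like _ZGVbN2v_cos can be split into its fields: the
_ZGV prefix, an ISA letter ('b', the SSE class on x86_64), 'N' for notinbranch
(unmasked) or 'M' for inbranch, the simdlen (2), the parameter kinds ('v' for
a vector argument), and the scalar name.  The sketch below is a minimal
decoder for that simple layout; the struct and function names are made up for
illustration and are not part of GCC.

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Decoded fields of a vector-function name such as "_ZGVbN2v_cos".
   Hypothetical helper type, for illustration only. */
struct simd_clone_name {
    char isa;            /* ISA letter, e.g. 'b' */
    int masked;          /* 1 for 'M' (inbranch), 0 for 'N' (notinbranch) */
    unsigned simdlen;    /* number of lanes handled per call */
    char params[16];     /* parameter kind letters, e.g. "v" */
    const char *scalar;  /* points at the scalar name inside the input */
};

/* Returns 0 on success, -1 if the name does not follow the simple
   "_ZGV<isa><mask><simdlen><params>_<scalar>" layout handled here. */
int decode_simd_clone_name(const char *name, struct simd_clone_name *out)
{
    if (strncmp(name, "_ZGV", 4) != 0)
        return -1;
    const char *p = name + 4;
    if (*p == '\0')
        return -1;
    out->isa = *p++;
    if (*p != 'M' && *p != 'N')
        return -1;
    out->masked = (*p++ == 'M');
    if (!isdigit((unsigned char)*p))
        return -1;
    char *end;
    out->simdlen = (unsigned)strtoul(p, &end, 10);
    p = end;
    /* Copy the parameter letters up to the '_' that precedes the scalar name. */
    size_t n = 0;
    while (*p != '\0' && *p != '_' && n + 1 < sizeof out->params)
        out->params[n++] = *p++;
    out->params[n] = '\0';
    if (*p != '_' || n == 0)
        return -1;
    out->scalar = p + 1;
    return 0;
}
```

Decoding "_ZGVbN2v_cos" this way gives simdlen 2, so a call to it handles two
doubles at once, in contrast to the 1-lane calls in the bug title.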

> So it may be a GCC 14 only regression, but I think it's latent in the
> vectorizer.

I think there's something odd with the backend here, but I can confirm the
behavior.  Note that it analyzes and costs VF == 4 with V2DF, resulting in
6 calls, but then code generation comes along and does something different!?
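
One way to read the figure of 6, assuming it comes from the SLP group size of
3 (the three cos calls per iteration in the testcase) times VF == 4, divided
by the 2 lanes of V2DF.  The helper below is only a sketch of that arithmetic;
its name is hypothetical and it does not reflect how the vectorizer computes
costs internally.

```c
/* Number of vector calls needed per vectorized iteration, assuming each
   call handles `lanes` scalars and nothing is scalarized. */
unsigned vector_calls_per_iteration(unsigned group_size, unsigned vf,
                                    unsigned lanes)
{
    unsigned scalars = group_size * vf;    /* scalar calls being covered */
    return (scalars + lanes - 1) / lanes;  /* round up to whole calls */
}
```

With group_size 3, vf 4 and lanes 2 this gives 6; with lanes 1 it gives 12,
one call per scalar, which is the kind of code the bug title describes.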

[Bug tree-optimization/113552] [11/12/13/14 Regression] vectorizer generates calls to vector math routines with 1 simd lane.
