I am testing the latest GCC with not-yet-submitted glibc changes that
implement libmvec on AArch64.

While trying to run SPEC 2017 (specifically 521.wrf_r) I ran into a
case where GCC generated a call to _ZGVnN2vv_powf, that is, a
vectorized powf call for 2 (not 4) elements.  This was a problem
because I only implemented a 4-element, 32-bit vectorized powf function
for libmvec and not a 2-element version.
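
For readers unfamiliar with the symbol name, here is a minimal sketch of
how a name such as _ZGVnN2vv_powf is put together.  It assumes the
AArch64 vector function ABI naming scheme ("n" for Advanced SIMD, "N"
for unmasked, then the lane count, then one "v" per vector argument).
The helper vector_variant_name is hypothetical and exists only for this
illustration; it is not part of GCC or glibc.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not GCC or glibc code): format the name of an
   unmasked Advanced SIMD variant of scalar_name that processes `lanes`
   elements per call and takes `vector_args` vector arguments.
   Returns the value of snprintf.  */
static int vector_variant_name(char *buf, size_t size,
                               const char *scalar_name,
                               unsigned lanes, unsigned vector_args)
{
    char params[16];

    /* One 'v' per vector argument, e.g. "vv" for powf(x, y).  */
    assert(vector_args < sizeof params);
    memset(params, 'v', vector_args);
    params[vector_args] = '\0';

    /* "_ZGV" prefix, 'n' = Advanced SIMD, 'N' = unmasked.  */
    return snprintf(buf, size, "_ZGVnN%u%s_%s", lanes, params, scalar_name);
}
```

With lanes = 2 and two vector arguments this yields the symbol GCC asked
for; with lanes = 4 it yields _ZGVnN4vv_powf, the only variant that was
implemented.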

I think this is due to aarch64_simd_clone_compute_vecsize_and_simdlen,
which allows (element count * element size) to be either 64 or 128
bits.
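
The arithmetic behind that can be sketched as follows.  This is only an
illustration of the (element count * element size) relation described
above, not the actual GCC code; lanes_for is a hypothetical name.

```c
#include <assert.h>

/* Hypothetical helper, not taken from GCC: number of lanes when a
   vector of vector_bits bits is filled with elements that are
   element_bits wide.  */
static unsigned lanes_for(unsigned vector_bits, unsigned element_bits)
{
    assert(element_bits != 0 && vector_bits % element_bits == 0);
    return vector_bits / element_bits;
}
```

For 32-bit floats, lanes_for(64, 32) gives 2 and lanes_for(128, 32)
gives 4, which is why both a 2-element and a 4-element powf variant are
considered valid.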

I would like some thoughts on what we should do about this.  Should
we require glibc/libmvec to provide 2-element, 32-bit floating-point
vector functions (as well as the 4-element ones), or should we change
aarch64_simd_clone_compute_vecsize_and_simdlen to allow only 4-element
(128 bits total) vectors and not 2-element (64 bits total) ones?

This is obviously a question about the pre-SVE vector instructions;
I am not sure how this would be handled in SVE.

Steve Ellcey
sell...@marvell.com

P.S.  Here is a test case in Fortran that generated the 2-element
      vector call.  GCC turned the loop into one vector call
      of 2 elements and one scalar call.

      SUBROUTINE FOO(B,W,P)
      REAL, DIMENSION (3) :: W, P
      DO 10 I = 1, 3
      P(I) = W(I) ** B
10    CONTINUE
      END SUBROUTINE FOO
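
The split of the 3 iterations into one vector call and one scalar call
is plain division with remainder.  The sketch below shows only that
arithmetic; it does not model how GCC's vectorizer actually decides, and
split_iterations is a hypothetical name.

```c
#include <assert.h>

/* Result of dividing a loop's iterations among vector and scalar calls.  */
struct call_split {
    unsigned vector_calls;  /* calls that each handle `lanes` elements */
    unsigned scalar_calls;  /* leftover iterations handled one at a time */
};

/* Hypothetical helper: full vector calls plus scalar remainder for a
   loop with trip_count iterations and a vector variant of `lanes`
   elements.  */
static struct call_split split_iterations(unsigned trip_count, unsigned lanes)
{
    struct call_split s;

    assert(lanes != 0);
    s.vector_calls = trip_count / lanes;
    s.scalar_calls = trip_count % lanes;
    return s;
}
```

For the loop above, split_iterations(3, 2) gives one vector call and one
scalar call, matching what was observed, while split_iterations(3, 4)
gives no vector calls and three scalar calls.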
