Hi all,
This is a series of patches/RFCs to implement support in GCC to be able
to target AArch64's libmvec functions that will be/are being added to glibc.
We have chosen to use the omp pragma '#pragma omp declare variant ...'
with a simd construct as the way for glibc to inform GCC what functions
are available.
For example, if we would like to supply a vector version of the scalar
'cosf' we would have an include file with something like:
typedef __attribute__((__neon_vector_type__(4))) float __f32x4_t;
typedef __attribute__((__neon_vector_type__(2))) float __f32x2_t;
typedef __SVFloat32_t __sv_f32_t;
typedef __SVBool_t __sv_bool_t;
__f32x4_t _ZGVnN4v_cosf (__f32x4_t);
__f32x2_t _ZGVnN2v_cosf (__f32x2_t);
__sv_f32_t _ZGVsMxv_cosf (__sv_f32_t, __sv_bool_t);
#pragma omp declare variant(_ZGVnN4v_cosf) \
match(construct = {simd(notinbranch, simdlen(4))}, device =
{isa("simd")})
#pragma omp declare variant(_ZGVnN2v_cosf) \
match(construct = {simd(notinbranch, simdlen(2))}, device =
{isa("simd")})
#pragma omp declare variant(_ZGVsMxv_cosf) \
match(construct = {simd(inbranch)}, device = {isa("sve")})
extern float cosf (float);
The BETA ABI can be found in the vfabia64 subdir of
https://github.com/ARM-software/abi-aa/
This currently disagrees with how this patch series implements 'omp
declare simd' for SVE and I also do not see a need for the 'omp declare
variant' scalable extension constructs. I will make changes to the ABI
once we've finalized the co-design of the ABI and this implementation.
The patch series has three main steps:
1) Add SVE support for 'omp declare simd', see PR 96342
2) Enable GCC to use omp declare variants with simd constructs as simd
clones during auto-vectorization.
3) Add SLP support for vectorizable_simd_clone_call (This sounded like a
nice thing to add as we want to move away from non-slp vectorization).
Below you can see the list of current Patches/RFCs, the difference being
on how confident I am of the proposed changes. For the RFC I am hoping
to get early comments on the approach, rather than more indepth
code-reviews.
I appreciate we are still in Stage 4, so I can completely understand if
you don't have time to review this now, but I thought it can't hurt to
post these early.
Andre Vieira:
[PATCH] omp: Replace simd_clone_subparts with TYPE_VECTOR_SUBPARTS
[PATCH] parloops: Copy target and optimizations when creating a function
clone
[PATCH] parloops: Allow poly nit and bound
[RFC] omp, aarch64: Add SVE support for 'omp declare simd' [PR 96342]
[RFC] omp: Create simd clones from 'omp declare variant's
[RFC] omp: Allow creation of simd clones from omp declare variant with
-fopenmp-simd flag
Work in progress:
[RFC] vect: Enable SLP codegen for vectorizable_simd_clone_call