Hi all,

This is a series of patches/RFCs that adds GCC support for targeting the AArch64 libmvec functions that are being added to glibc. We have chosen the OpenMP pragma '#pragma omp declare variant ...' with a simd construct as the way for glibc to tell GCC which functions are available.

For example, if we would like to supply a vector version of the scalar 'cosf' we would have an include file with something like:
typedef __attribute__((__neon_vector_type__(4))) float __f32x4_t;
typedef __attribute__((__neon_vector_type__(2))) float __f32x2_t;
typedef __SVFloat32_t __sv_f32_t;
typedef __SVBool_t __sv_bool_t;
__f32x4_t _ZGVnN4v_cosf (__f32x4_t);
__f32x2_t _ZGVnN2v_cosf (__f32x2_t);
__sv_f32_t _ZGVsMxv_cosf (__sv_f32_t, __sv_bool_t);
#pragma omp declare variant(_ZGVnN4v_cosf) \
    match(construct = {simd(notinbranch, simdlen(4))}, device = {isa("simd")})
#pragma omp declare variant(_ZGVnN2v_cosf) \
    match(construct = {simd(notinbranch, simdlen(2))}, device = {isa("simd")})
#pragma omp declare variant(_ZGVsMxv_cosf) \
    match(construct = {simd(inbranch)}, device = {isa("sve")})
extern float cosf (float);
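
For readers unfamiliar with the vector function names above, here is a minimal sketch of how I read their layout: '_ZGV', an ISA letter ('n' for Advanced SIMD, 's' for SVE), a mask letter ('N' for notinbranch, 'M' for inbranch), the simdlen (a number, or 'x' for scalable), the parameter kinds, then '_' and the scalar name. The struct and function names below are hypothetical and exist only for illustration; they are not part of GCC, glibc or the ABI, and the parameter kinds are skipped rather than parsed.

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical illustration type, not part of GCC or glibc.  */
struct vfabi_name
{
  char isa;           /* 'n' = Advanced SIMD, 's' = SVE.  */
  int masked;         /* 1 for 'M' (inbranch), 0 for 'N' (notinbranch).  */
  int simdlen;        /* Number of lanes, or 0 for 'x' (scalable).  */
  const char *scalar; /* Scalar function name, points into the input.  */
};

/* Decode NAME into OUT.  Returns 0 on success, -1 if NAME does not
   look like a vector function name.  */
static int
decode_vfabi_name (const char *name, struct vfabi_name *out)
{
  if (strncmp (name, "_ZGV", 4) != 0)
    return -1;
  const char *p = name + 4;

  if (*p == '\0')
    return -1;
  out->isa = *p++;

  if (*p != 'N' && *p != 'M')
    return -1;
  out->masked = (*p++ == 'M');

  if (*p == 'x')
    {
      out->simdlen = 0;
      p++;
    }
  else if (isdigit ((unsigned char) *p))
    {
      char *end;
      out->simdlen = (int) strtol (p, &end, 10);
      p = end;
    }
  else
    return -1;

  /* Skip the parameter kinds (e.g. 'v') up to the separating '_'.  */
  p = strchr (p, '_');
  if (p == NULL || p[1] == '\0')
    return -1;
  out->scalar = p + 1;
  return 0;
}
```

With this sketch, '_ZGVnN4v_cosf' decodes to an unmasked 4-lane Advanced SIMD variant of 'cosf', and '_ZGVsMxv_cosf' to a masked, scalable SVE variant, matching the simd clauses in the pragmas above.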

The BETA ABI can be found at https://github.com/ARM-software/abi-aa/ (vfabia64 subdir). It currently disagrees with how this patch series implements 'omp declare simd' for SVE, and I also do not see a need for the 'omp declare variant' scalable extension constructs. I will make changes to the ABI once we have finalized the co-design of the ABI and this implementation.

The patch series has three main steps:
1) Add SVE support for 'omp declare simd', see PR 96342
2) Enable GCC to use omp declare variants with simd constructs as simd clones during auto-vectorization.
3) Add SLP support for vectorizable_simd_clone_call (this sounded like a nice thing to add, as we want to move away from non-SLP vectorization).
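
To make the first step concrete from the user's side, here is a minimal sketch of 'omp declare simd' on a scalar function and a loop that calls it. The names 'scale_and_bias' and 'apply' are hypothetical. Whether SVE clones are actually generated and used depends on the RFC being applied and on the target options; without -fopenmp or -fopenmp-simd the pragma is ignored and the code still behaves as plain scalar C.

```c
#include <stddef.h>

/* Hypothetical scalar function.  With -fopenmp-simd the pragma asks
   the compiler to also emit unmasked vector clones of it.  */
#pragma omp declare simd notinbranch
__attribute__ ((noinline)) float
scale_and_bias (float x)
{
  return 2.0f * x + 1.0f;
}

/* A loop the vectorizer can turn into calls to a simd clone of
   scale_and_bias, if one is available for the chosen vector mode.  */
void
apply (float *restrict out, const float *restrict in, size_t n)
{
  for (size_t i = 0; i < n; i++)
    out[i] = scale_and_bias (in[i]);
}
```

The second step is about letting a declaration with 'omp declare variant' and a simd construct play the same role as these generated clones when the loop is auto-vectorized.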

Below you can see the list of current patches and RFCs; the difference between the two is how confident I am in the proposed changes. For the RFCs I am hoping to get early comments on the approach, rather than more in-depth code reviews.

I appreciate we are still in Stage 4, so I can completely understand if you don't have time to review this now, but I thought it can't hurt to post these early.

Andre Vieira:
[PATCH] omp: Replace simd_clone_subparts with TYPE_VECTOR_SUBPARTS
[PATCH] parloops: Copy target and optimizations when creating a function clone
[PATCH] parloops: Allow poly nit and bound
[RFC] omp, aarch64: Add SVE support for 'omp declare simd' [PR 96342]
[RFC] omp: Create simd clones from 'omp declare variant's
[RFC] omp: Allow creation of simd clones from omp declare variant with -fopenmp-simd flag

Work in progress:
[RFC] vect: Enable SLP codegen for vectorizable_simd_clone_call
