"Andre Vieira (lists)" <andre.simoesdiasvie...@arm.com> writes: > Hi, > > This patch enables the use of mixed-types for simd clones for AArch64 > and adds aarch64 as a target_vect_simd_clones. > > Bootstrapped and regression tested on aarch64-unknown-linux-gnu > > gcc/ChangeLog: > > * config/aarch64/aarch64.cc (currently_supported_simd_type): > Remove. > (aarch64_simd_clone_compute_vecsize_and_simdlen): Use NFS type > to determine simdlen. > > gcc/testsuite/ChangeLog: > > * lib/target-supports.exp: Add aarch64 targets to vect_simd_clones. > * c-c++-common/gomp/declare-variant-14.c: Add aarch64 checks > and remove warning check. > * g++.dg/gomp/attrs-10.C: Likewise. > * g++.dg/gomp/declare-simd-1.C: Likewise. > * g++.dg/gomp/declare-simd-3.C: Likewise. > * g++.dg/gomp/declare-simd-4.C: Likewise. > * gcc.dg/gomp/declare-simd-3.c: Likewise. > * gcc.dg/gomp/simd-clones-2.c: Likewise. > * gfortran.dg/gomp/declare-variant-14.f90: Likewise. > * c-c++-common/gomp/pr60823-1.c: Remove warning check. > * c-c++-common/gomp/pr60823-3.c: Likewise. > * g++.dg/gomp/declare-simd-7.C: Likewise. > * g++.dg/gomp/declare-simd-8.C: Likewise. > * g++.dg/gomp/pr88182.C: Likewise. > * gcc.dg/declare-simd.c: Likewise. > * gcc.dg/gomp/declare-simd-1.c: Likewise. > * gcc.dg/gomp/pr87895-1.c: Likewise. > * gfortran.dg/gomp/declare-simd-2.f90: Likewise. > * gfortran.dg/gomp/declare-simd-coarray-lib.f90: Likewise. > * gfortran.dg/gomp/pr79154-1.f90: Likewise. > * gfortran.dg/gomp/pr83977.f90: Likewise. > * gcc.dg/gomp/pr87887-1.c: Add warning test. > * gcc.dg/gomp/pr89246-1.c: Likewise. > * gcc.dg/gomp/pr99542.c: Update warning test. > > diff --git a/gcc/config/aarch64/aarch64.cc b/gcc/config/aarch64/aarch64.cc > index > 560e5431636ef46c41d56faa0c4e95be78f64b50..ac6350a44481628a947a0f20e034acf92cde63ec > 100644 > --- a/gcc/config/aarch64/aarch64.cc > +++ b/gcc/config/aarch64/aarch64.cc > @@ -27194,21 +27194,6 @@ supported_simd_type (tree t) > return false; > } > > -/* Return true for types that currently are supported as SIMD return > - or argument types. */ > - > -static bool > -currently_supported_simd_type (tree t, tree b) > -{ > - if (COMPLEX_FLOAT_TYPE_P (t)) > - return false; > - > - if (TYPE_SIZE (t) != TYPE_SIZE (b)) > - return false; > - > - return supported_simd_type (t); > -} > - > /* Implement TARGET_SIMD_CLONE_COMPUTE_VECSIZE_AND_SIMDLEN. */ > > static int > @@ -27217,7 +27202,7 @@ aarch64_simd_clone_compute_vecsize_and_simdlen > (struct cgraph_node *node, > tree base_type, int num, > bool explicit_p) > { > - tree t, ret_type; > + tree t, ret_type, nfs_type; > unsigned int elt_bits, count; > unsigned HOST_WIDE_INT const_simdlen; > poly_uint64 vec_bits; > @@ -27240,55 +27225,61 @@ aarch64_simd_clone_compute_vecsize_and_simdlen > (struct cgraph_node *node, > } > > ret_type = TREE_TYPE (TREE_TYPE (node->decl)); > + /* According to AArch64's Vector ABI the type that determines the simdlen > is > + the narrowest of types, so we ignore base_type for AArch64. */ > if (TREE_CODE (ret_type) != VOID_TYPE > - && !currently_supported_simd_type (ret_type, base_type)) > + && !supported_simd_type (ret_type)) > { > if (!explicit_p) > ; > - else if (TYPE_SIZE (ret_type) != TYPE_SIZE (base_type)) > - warning_at (DECL_SOURCE_LOCATION (node->decl), 0, > - "GCC does not currently support mixed size types " > - "for %<simd%> functions"); > - else if (supported_simd_type (ret_type)) > + else if (COMPLEX_FLOAT_TYPE_P (ret_type)) > warning_at (DECL_SOURCE_LOCATION (node->decl), 0, > "GCC does not currently support return type %qT " > - "for %<simd%> functions", ret_type); > + "for simd", ret_type); > else > warning_at (DECL_SOURCE_LOCATION (node->decl), 0, > - "unsupported return type %qT for %<simd%> functions", > + "unsupported return type %qT for simd", > ret_type);
What's the reason for s/%<simd%> functions/simd/, in particular for dropping the quotes around simd? > return 0; > } > > + nfs_type = ret_type; Genuine question, but what does nfs stand for in this context? > int i; > tree type_arg_types = TYPE_ARG_TYPES (TREE_TYPE (node->decl)); > bool decl_arg_p = (node->definition || type_arg_types == NULL_TREE); > - > for (t = (decl_arg_p ? DECL_ARGUMENTS (node->decl) : type_arg_types), i = > 0; > t && t != void_list_node; t = TREE_CHAIN (t), i++) > { > tree arg_type = decl_arg_p ? TREE_TYPE (t) : TREE_VALUE (t); > - > if (clonei->args[i].arg_type != SIMD_CLONE_ARG_TYPE_UNIFORM > - && !currently_supported_simd_type (arg_type, base_type)) > + && !supported_simd_type (arg_type)) > { > if (!explicit_p) > ; > - else if (TYPE_SIZE (arg_type) != TYPE_SIZE (base_type)) > + else if (COMPLEX_FLOAT_TYPE_P (ret_type)) > warning_at (DECL_SOURCE_LOCATION (node->decl), 0, > - "GCC does not currently support mixed size types " > - "for %<simd%> functions"); > + "GCC does not currently support argument type %qT " > + "for simd", arg_type); > else > warning_at (DECL_SOURCE_LOCATION (node->decl), 0, > - "GCC does not currently support argument type %qT " > - "for %<simd%> functions", arg_type); > + "unsupported argument type %qT for simd", > + arg_type); > return 0; > } > + if (clonei->args[i].arg_type != SIMD_CLONE_ARG_TYPE_UNIFORM > + && (TREE_CODE (nfs_type) == VOID_TYPE > + || TYPE_PRECISION (nfs_type) > TYPE_PRECISION (arg_type))) > + nfs_type = arg_type; > } > > + /* If we could not determine the nfs_type from available parameters/return, > + then fallback to using base_type. */ > + if (TREE_CODE (nfs_type) == VOID_TYPE) > + nfs_type = base_type; I don't think this implements the NDS calculation in the spec: The `Narrowest Data Size of f`, or ``NDS(f)``, as the minumum of the lane size ``LS(P)`` among all input parameters and return value ``<P>`` of ``f``. ... We then define the `Lane Size of P`, or ``LS(P)``, as follows. 1. If ``MTV(P)`` is ``false`` and ``P`` is a pointer or reference to some type ``T`` for which ``PBV(T)`` is ``true``, ``LS(P) = sizeof(T)``. 2. If ``PBV(T(P))`` is ``true``, ``LS(P) = sizeof(P)``. 3. Otherwise ``LS(P) = sizeof(uintptr_t)``. AIUI, (1) means that we need to look at the targets of uniform and linear scalars[*] that have pointer type, so that e.g. a uniform uint8_t * pointer should cause NDS to be 1. [*] i.e. arguments that remain scalar in the vector prototype (2) means that other types of uniform and linear scalars do contribute. A uniform uint8_t should cause NDS to be 1. Thanks, Richard