The following patch series is reworked from its first version based on Jakub's
review comments in
https://gcc.gnu.org/pipermail/gcc-patches/2024-August/659540.html

The changes in v2:

1. Moved all execute tests under libgomp/testsuite/libgomp.target/aarch64/.
2. Retained gcc/testsuite/gcc.target/aarch64/sve/omp/ for compile tests.
3. Handled offloading of SVE types differently for sizeless and fixed-size
   types. Also added more tests to check VLA and VLS types.
4. Made tests more representative of real-world scenarios.
5. Converted some compile tests to execute tests.
6. For user-defined reductions, I have removed the task and taskloop tests for
   now. I need to understand the constructs better before adding meaningful
   tests.
7. There is one known failure: declare simd uniform clones a function to a
   variant to support a particular type in the clause. This fails on SVE with
   a "decl without prototype" error. It is unclear how this ought to be
   handled. I went ahead and posted the rest of the series as I didn't want
   this issue to block the rest of the patches.

The following patch series handles various scenarios with OpenMP and SVE types.
The starting point for the series follows a suggestion from Jakub to cover all
the possible scenarios that could arise when OMP constructs, clauses, etc. are
used with SVE ACLE types. Here are a few instances that this patch series tests
and, in some cases, fixes the expected output for. This patch series does not
follow a formal definition or a spec of how OMP interacts with SVE ACLE types,
so it's more of a proposed behaviour. Comments and discussion welcome.

This list is not exhaustive, but covers most scenarios of how SVE ACLE types
ought to interact with OMP constructs/clauses.

1. Poly-int structures that represent variable-sized objects and OMP runtime.
Currently poly-int type structures are passed by value to OpenMP runtime
functions for shared clauses etc. This patch improves on this by passing
poly-int structures by address to avoid copy overhead.

2. SVE ACLE types in OMP shared clauses.
We test the behaviour where SVE ACLE type objects are shared into an OMP
region in the following ways:
   a. Explicit shared clause on SVE ACLE type objects.
   b. Implicit shared clause.
   c. Implicit shared with default clause.
   d. SVE ACLE types in the presence of predetermined (static) shared objects.
The associated tests ensure that all such shared objects are passed by address
into the OMP runtime. There are runtime tests to verify the functional
correctness of the change.
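
To make case (a) above concrete, here is a minimal sketch of an explicitly
shared SVE object in a parallel loop. It is not taken from the tests in this
series: scale_by_shared is a hypothetical name, and the scalar #else branch is
only there so the sketch also builds on hosts without SVE. Whether the SVE
branch compiles depends on a compiler with the behaviour proposed here.

```c
#include <assert.h>
#include <stdint.h>
#if defined(__ARM_FEATURE_SVE)
#include <arm_sve.h>
#endif

/* Multiply a[0..n-1] in place by k.  Hypothetical helper, for illustration
   only.  */
void scale_by_shared(int32_t *a, int n, int32_t k)
{
  assert(n >= 0);
#if defined(__ARM_FEATURE_SVE)
  svint32_t vk = svdup_n_s32(k);   /* sizeless object, explicitly shared */
  int step = (int) svcntw();       /* number of 32-bit lanes per vector */
  #pragma omp parallel for shared(vk)
  for (int i = 0; i < n; i += step)
    {
      svbool_t pg = svwhilelt_b32_s32(i, n);   /* masks off the loop tail */
      svint32_t va = svld1_s32(pg, a + i);
      svst1_s32(pg, a + i, svmul_s32_x(pg, va, vk));
    }
#else
  /* Scalar fallback (assumption: only for building without SVE).  */
  #pragma omp parallel for shared(k)
  for (int i = 0; i < n; i++)
    a[i] *= k;
#endif
}
```

Calling scale_by_shared (buf, 5, 3) on {1, 2, 3, 4, 5} should leave
{3, 6, 9, 12, 15} in buf regardless of the number of threads, since each
iteration only reads the shared vector.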

3. [tree] Add function to strip pointer type and get down to the actual
pointee type.
Adds a support function in tree.h that strips pointer types to drill down to
the pointee type.

4. Offloading and SVE ACLE types.
The target construct in OpenMP is used to offload loop kernels to accelerator
peripherals. target's 'map' clause is used to move data to and from the
accelerator. When the data is of a sizeless SVE type, it may be unsuitable for
various reasons, e.g. the two SVE targets may not agree on vector size, or
some targets may not support variable vector sizes. This makes sizeless SVE
types unsuitable for use in OMP's 'map' clause. We diagnose all such cases and
issue errors where appropriate. The cases we cover in this patch are:
   a. Implicitly-mapped SVE ACLE types in OMP target regions are diagnosed.
   b. Explicitly-mapped SVE ACLE types in OMP target regions using the map
      clause are diagnosed.
   c. Explicitly-mapped SVE ACLE types of various directions (to, from,
      tofrom) in the map clause are diagnosed.
   d. target enter data and target exit data with map on SVE ACLE types are
      diagnosed.
   e. target data map with alloc on SVE ACLE types is diagnosed.
   f. target update with a from clause on SVE ACLE types is diagnosed.
   g. target private and firstprivate with SVE ACLE types are diagnosed.
   h. All combinations of target with work-sharing constructs like parallel,
      loop, simd, teams, distribute, etc. are also diagnosed when SVE ACLE
      types are involved.
For fixed-size SVE vector types (e.g. fixed by the arm_sve_vector_bits
attribute), we don't diagnose. Fixed-size vectors are allowed to be used in OMP
offloading constructs and clauses. The only caveat is that the LTO streamers
that handle streaming in the offloaded bytecode are expected to check for
matching vector sizes and diagnose mismatches, as the attribute sizes are also
streamed out.
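
As a rough illustration of the fixed-size case, the sketch below maps a
fixed-length vector into a target region. It assumes a build with
-msve-vector-bits=N so that __ARM_FEATURE_SVE_BITS is non-zero; vec_t, LANES
and offload_sum are hypothetical names, and the struct in the #else branch is
only a stand-in so the sketch builds elsewhere. Replacing vec_t with plain
svint32_t in the map clause is the kind of use this series proposes to reject.

```c
#include <assert.h>
#include <stdint.h>

#if defined(__ARM_FEATURE_SVE_BITS) && __ARM_FEATURE_SVE_BITS > 0
#include <arm_sve.h>
/* Fixed-size (VLS) type: its size is known at compile time.  */
typedef svint32_t vec_t
  __attribute__((arm_sve_vector_bits(__ARM_FEATURE_SVE_BITS)));
enum { LANES = __ARM_FEATURE_SVE_BITS / 32 };
#define HAVE_FIXED_SVE 1
#else
/* Stand-in type (assumption: only for building without fixed-length SVE).  */
enum { LANES = 4 };
typedef struct { int32_t lane[LANES]; } vec_t;
#define HAVE_FIXED_SVE 0
#endif

/* Sum the first LANES elements of src inside a target region.  */
int64_t offload_sum(const int32_t *src)
{
  assert(src);
  int64_t total = 0;
#if HAVE_FIXED_SVE
  vec_t v = svld1_s32(svptrue_b32(), src);
  #pragma omp target map(to: v) map(tofrom: total)
  total = svaddv_s32(svptrue_b32(), v);
#else
  vec_t v;
  for (int i = 0; i < LANES; i++)
    v.lane[i] = src[i];
  #pragma omp target map(to: v) map(tofrom: total)
  for (int i = 0; i < LANES; i++)
    total += v.lane[i];
#endif
  return total;
}
```

With no offload device configured, the target region simply runs on the host,
so the function returns the same sum either way.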

5. Lastprivate and SVE ACLE types.
Various OpenMP lastprivate clause scenarios with SVE object types are
diagnosed. Worksharing constructs like sections, for and distribute bind to an
implicit outer parallel region in whose scope SVE ACLE types are declared and
are therefore private by default. Lastprivate clause list items of SVE ACLE
type are diagnosed in this scenario.
Execute tests have been added for checking functional behaviour.

6. Threadprivate on SVE ACLE type objects.
We ensure threadprivate SVE ACLE type objects are supported. We also ensure
that the copyin clause is supported.

7. User-Defined Reductions on SVE ACLE types.
We define a reduction with OMP declare reduction using SVE ACLE intrinsics and
ensure its functional correctness with various work-sharing constructs like
for, simd, parallel.
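
A user-defined reduction of this kind could look roughly like the sketch
below. The reduction identifier sve_add and the function sum_i32 are
hypothetical names chosen for illustration, not the ones used in udr-sve.c,
and the scalar #else branch exists only so the sketch builds without SVE.

```c
#include <assert.h>
#include <stdint.h>
#if defined(__ARM_FEATURE_SVE)
#include <arm_sve.h>

/* Combine two partial sums lane by lane; each private copy starts at zero.  */
#pragma omp declare reduction(sve_add : svint32_t : \
    omp_out = svadd_s32_x(svptrue_b32(), omp_out, omp_in)) \
    initializer(omp_priv = svdup_n_s32(0))
#endif

/* Sum a[0..n-1].  Per-lane partial sums are 32-bit, so this sketch is only
   meant for small inputs.  */
int64_t sum_i32(const int32_t *a, int n)
{
  assert(n >= 0);
#if defined(__ARM_FEATURE_SVE)
  svint32_t acc = svdup_n_s32(0);
  int step = (int) svcntw();
  #pragma omp parallel for reduction(sve_add : acc)
  for (int i = 0; i < n; i += step)
    {
      svbool_t pg = svwhilelt_b32_s32(i, n);
      /* Inactive lanes of the load are zero, so they do not affect acc.  */
      acc = svadd_s32_x(svptrue_b32(), acc, svld1_s32(pg, a + i));
    }
  return svaddv_s32(svptrue_b32(), acc);   /* horizontal add of all lanes */
#else
  /* Scalar fallback (assumption: only for building without SVE).  */
  int64_t total = 0;
  #pragma omp parallel for reduction(+ : total)
  for (int i = 0; i < n; i++)
    total += a[i];
  return total;
#endif
}
```

The named identifier sidesteps the question of whether '+' may be declared
for a vector type; that choice is mine and not necessarily what the tests do.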

8. Uniform and Aligned Clause with SVE ACLE
We ensure the uniform clause's functional correctness with the simd construct
and associated SVE ACLE intrinsics in the simd region. There is no direct
interaction between uniform and SVE ACLE type objects, but we ensure the
uniform clause applies correctly to a region where SVE ACLE intrinsics are
present. Similarly for the aligned clause.

9. Linear clause and SVE ACLE type.
We diagnose if a linear clause list item has SVE ACLE type objects present.
Applying the linear clause to SVE ACLE types is not meaningful.

10. Depend clause and SVE ACLE objects.
We test for functional correctness many combinations of dependencies on shared
SVE ACLE type objects in parallel regions. We test whether in and out
dependencies and anti-dependencies are supported for SVE ACLE type objects
using the depend clause with constructs like task.
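
The sketch below shows the general shape of such a test: three tasks ordered
through depend clauses on one shared vector. It is an illustration under my
own assumptions rather than an excerpt from depend-1.c; double_then_sum4 is a
hypothetical name, only the first four 32-bit lanes are used so the result
does not depend on the vector length, and the array in the #else branch is a
stand-in for building without SVE.

```c
#include <assert.h>
#include <stdint.h>
#if defined(__ARM_FEATURE_SVE)
#include <arm_sve.h>
#endif

/* Load a[0..3], double each element, and return the sum.  */
int64_t double_then_sum4(const int32_t *a)
{
  assert(a);
  int64_t result = 0;
#if defined(__ARM_FEATURE_SVE)
  svint32_t v = svdup_n_s32(0);
  #pragma omp parallel
  #pragma omp single
  {
    /* Producer: lanes beyond the fourth are loaded as zero.  */
    #pragma omp task depend(out: v) shared(v)
    v = svld1_s32(svwhilelt_b32_s32(0, 4), a);
    /* Runs after the producer because of the inout dependence.  */
    #pragma omp task depend(inout: v) shared(v)
    v = svadd_s32_x(svptrue_b32(), v, v);
    /* Consumer: runs last.  */
    #pragma omp task depend(in: v) shared(v, result)
    result = svaddv_s32(svptrue_b32(), v);
  }
#else
  /* Stand-in array (assumption: only for building without SVE).  */
  int32_t v[4] = {0, 0, 0, 0};
  #pragma omp parallel
  #pragma omp single
  {
    #pragma omp task depend(out: v) shared(v)
    for (int i = 0; i < 4; i++)
      v[i] = a[i];
    #pragma omp task depend(inout: v) shared(v)
    for (int i = 0; i < 4; i++)
      v[i] += v[i];
    #pragma omp task depend(in: v) shared(v, result)
    for (int i = 0; i < 4; i++)
      result += v[i];
  }
#endif
  return result;
}
```

All tasks finish at the implicit barrier that ends the single region, so
result can be read safely after the parallel region.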

11. 'doacross' clause and SVE ACLE object types.
doacross is mainly supported for scalars and loop iteration variables. We
diagnose cases where SVE ACLE objects are used in doacross list items.

Tejas Belagod (12):
  OpenMP/PolyInt: Pass poly-int structures by address to OMP libs.
  libgomp, AArch64: Add test cases for SVE types in OpenMP shared clause.
  [tree] Add function to strip pointer type and get down to the actual pointee type.
  AArch64: Diagnose OpenMP offloading when SVE types involved.
  libgomp, AArch64: Test OpenMP lastprivate clause for various constructs.
  libgomp, AArch64: Test OpenMP threadprivate clause on SVE type.
  libgomp, AArch64: Test OpenMP user-defined reductions with SVE types.
  libgomp, AArch64: Test OpenMP uniform clause on SVE types.
  libgomp, AArch64: Test OpenMP simd aligned clause with SVE types.
  AArch64: Diagnose OpenMP linear clause for SVE type objects.
  libgomp, AArch64: Test OpenMP depend clause and its variations on SVE types
  AArch64: Diagnose SVE type objects when applied to OpenMP doacross clause.

gcc/config/aarch64/aarch64-sve-builtins.cc | 52 +-
gcc/gimplify.cc | 34 +-
gcc/omp-low.cc | 3 +-
gcc/target.h | 19 +-
.../gcc.target/aarch64/sve/omp/doacross.c | 22 +
.../gcc.target/aarch64/sve/omp/gomp.exp | 46 ++
.../gcc.target/aarch64/sve/omp/lastprivate.c | 94 ++++
.../gcc.target/aarch64/sve/omp/linear.c | 85 ++++
.../aarch64/sve/omp/offload-parallel-loop.c | 442 +++++++++++++++++
.../aarch64/sve/omp/offload-parallel.c | 376 +++++++++++++++
.../gcc.target/aarch64/sve/omp/offload-simd.c | 442 +++++++++++++++++
.../sve/omp/offload-teams-distribute-simd.c | 442 +++++++++++++++++
.../sve/omp/offload-teams-distribute.c | 442 +++++++++++++++++
.../aarch64/sve/omp/offload-teams-loop.c | 442 +++++++++++++++++
.../aarch64/sve/omp/offload-teams.c | 365 ++++++++++++++
.../gcc.target/aarch64/sve/omp/offload.c | 452 ++++++++++++++++++
.../aarch64/sve/omp/target-device.c | 186 +++++++
.../gcc.target/aarch64/sve/omp/target-link.c | 54 +++
gcc/tree.h | 9 +
.../libgomp.target/aarch64/aarch64.exp | 57 +++
.../libgomp.target/aarch64/depend-1.c | 223 +++++++++
.../libgomp.target/aarch64/lastprivate.c | 162 +++++++
.../testsuite/libgomp.target/aarch64/shared.c | 186 +++++++
.../libgomp.target/aarch64/simd-aligned.c | 51 ++
.../libgomp.target/aarch64/simd-nontemporal.c | 50 ++
.../libgomp.target/aarch64/simd-uniform.c | 83 ++++
.../libgomp.target/aarch64/threadprivate.c | 48 ++
.../libgomp.target/aarch64/udr-sve.c | 108 +++++
28 files changed, 4971 insertions(+), 4 deletions(-)
create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/omp/doacross.c
create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/omp/gomp.exp
create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/omp/lastprivate.c
create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/omp/linear.c
create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/omp/offload-parallel-loop.c
create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/omp/offload-parallel.c
create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/omp/offload-simd.c
create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/omp/offload-teams-distribute-simd.c
create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/omp/offload-teams-distribute.c
create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/omp/offload-teams-loop.c
create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/omp/offload-teams.c
create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/omp/offload.c
create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/omp/target-device.c
create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/omp/target-link.c
create mode 100644 libgomp/testsuite/libgomp.target/aarch64/aarch64.exp
create mode 100644 libgomp/testsuite/libgomp.target/aarch64/depend-1.c
create mode 100644 libgomp/testsuite/libgomp.target/aarch64/lastprivate.c
create mode 100644 libgomp/testsuite/libgomp.target/aarch64/shared.c
create mode 100644 libgomp/testsuite/libgomp.target/aarch64/simd-aligned.c
create mode 100644 libgomp/testsuite/libgomp.target/aarch64/simd-nontemporal.c
create mode 100644 libgomp/testsuite/libgomp.target/aarch64/simd-uniform.c
create mode 100644 libgomp/testsuite/libgomp.target/aarch64/threadprivate.c
create mode 100644 libgomp/testsuite/libgomp.target/aarch64/udr-sve.c