On 06/11/2024 15:41, Jakub Jelinek wrote:
> On Wed, Nov 06, 2024 at 03:27:22PM +0000, Andrew Stubbs wrote:
>> Ensure that the GOMP_MAX_VF does the right thing for explicit schedules, when
>> offloading is enabled ("target" directives are present), and is inactive
>> otherwise.
>>
>> This requires enabling the offload-dump scanning features previously only used
>> in the libgomp testsuite. The automake scheme used there isn't a good fit
>> here, so we probe the known devices manually.
>>
>> gcc/testsuite/ChangeLog:
>>
>> * gcc.dg/gomp/gomp.exp: Load scanoffload.exp and scanoffloadtree.exp.
>> Set offload_targets when available.
>> * gcc.dg/gomp/max_vf-1.c: New test.
>> * gcc.dg/gomp/max_vf-2.c: New test.
>> * gcc.dg/gomp/max_vf-3.c: New test.
>
> I don't see how this can work. gomp.exp isn't prepared to find the libgomp
> directory nor add -B options etc. Perhaps it appears to work if you have
> your system gcc's libgomp installed, but that isn't the library that should
> be used.
>
> So, max_vf-1.c test can stay where it is, but the gomp.exp changes shouldn't
> be done and max_vf-{2,3}.c should move to libgomp/testsuite/libgomp.c/
It worked for me, but I might have an unusual configuration that allows
me to test installed toolchains with remote devices, rather than build
trees with local devices.
Is this version OK?
Andrew
From c46061db946ab28968b36c0d2cb65b6093a144ee Mon Sep 17 00:00:00 2001
From: Andrew Stubbs <a...@baylibre.com>
Date: Wed, 6 Nov 2024 12:26:08 +0000
Subject: [PATCH] openmp: Add testcases for omp_max_vf
Ensure that GOMP_MAX_VF does the right thing for explicit schedules when
offloading is enabled ("target" directives are present), and that it is
inactive otherwise.
libgomp/ChangeLog:
* testsuite/libgomp.c/max_vf-1.c: New test.
* testsuite/libgomp.c/max_vf-2.c: New test.
gcc/testsuite/ChangeLog:
* gcc.dg/gomp/max_vf-1.c: New test.
---
gcc/testsuite/gcc.dg/gomp/max_vf-1.c | 37 ++++++++++++++++++++
libgomp/testsuite/libgomp.c/max_vf-1.c | 47 ++++++++++++++++++++++++++
libgomp/testsuite/libgomp.c/max_vf-2.c | 21 ++++++++++++
3 files changed, 105 insertions(+)
create mode 100644 gcc/testsuite/gcc.dg/gomp/max_vf-1.c
create mode 100644 libgomp/testsuite/libgomp.c/max_vf-1.c
create mode 100644 libgomp/testsuite/libgomp.c/max_vf-2.c
diff --git a/gcc/testsuite/gcc.dg/gomp/max_vf-1.c b/gcc/testsuite/gcc.dg/gomp/max_vf-1.c
new file mode 100644
index 00000000000..0513aae226c
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/gomp/max_vf-1.c
@@ -0,0 +1,37 @@
+/* Test that omp parallel simd schedule uses the correct max_vf for the
+   host system, when no target directives are present. */
+
+/* { dg-do compile } */
+/* { dg-options "-fopenmp -O2 -fdump-tree-ompexp" } */
+
+/* Fix a max_vf size so we can scan for it.
+{ dg-additional-options "-msse2" { target { x86_64-*-* i?86-*-* } } } */
+
+#define N 1024
+int a[N], b[N], c[N];
+
+void
+f2 (void)
+{
+  int i;
+  #pragma omp parallel for simd schedule (simd: static, 7)
+  for (i = 0; i < N; i++)
+    a[i] = b[i] + c[i];
+}
+
+/* Make sure the max_vf is inlined as a number.
+   Hopefully there are no unrelated uses of these numbers ...
+{ dg-final { scan-tree-dump-times {\* 16} 2 "ompexp" { target { x86_64-*-* } } } }
+{ dg-final { scan-tree-dump-times {\+ 16} 1 "ompexp" { target { x86_64-*-* } } } } */
+
+void
+f3 (int *a, int *b, int *c)
+{
+  int i;
+  #pragma omp parallel for simd schedule (simd : dynamic, 7)
+  for (i = 0; i < N; i++)
+    a[i] = b[i] + c[i];
+}
+
+/* Make sure the max_vf is inlined as a number.
+{ dg-final { scan-tree-dump-times {__builtin_GOMP_parallel_loop_nonmonotonic_dynamic \(.*, 16, 0\);} 1 "ompexp" { target { x86_64-*-* } } } } */
diff --git a/libgomp/testsuite/libgomp.c/max_vf-1.c b/libgomp/testsuite/libgomp.c/max_vf-1.c
new file mode 100644
index 00000000000..be900c565a3
--- /dev/null
+++ b/libgomp/testsuite/libgomp.c/max_vf-1.c
@@ -0,0 +1,47 @@
+/* Test that omp parallel simd schedule uses the correct max_vf for the
+   host system, when target directives are present. */
+
+/* { dg-require-effective-target offloading_enabled } */
+
+/* { dg-do link } */
+/* { dg-options "-fopenmp -O2 -fdump-tree-ompexp -foffload=-fdump-tree-optimized" } */
+
+/* Fix a max_vf size so we can scan for it.
+{ dg-additional-options "-msse2" { target { x86_64-*-* i?86-*-* } } } */
+
+#define N 1024
+int a[N], b[N], c[N];
+
+/* Test both static schedules and inline target directives. */
+void
+f2 (void)
+{
+  int i;
+  #pragma omp target parallel for simd schedule (simd: static, 7)
+  for (i = 0; i < N; i++)
+    a[i] = b[i] + c[i];
+}
+
+/* Test both dynamic schedules and declare target functions. */
+#pragma omp declare target
+void
+f3 (int *a, int *b, int *c)
+{
+  int i;
+  #pragma omp parallel for simd schedule (simd : dynamic, 7)
+  for (i = 0; i < N; i++)
+    a[i] = b[i] + c[i];
+}
+#pragma omp end declare target
+
+/* Make sure that the max_vf is used as an IFN.
+{ dg-final { scan-tree-dump-times {GOMP_MAX_VF} 2 "ompexp" { target { x86_64-*-* i?86-*-* } } } } */
+
+/* Make sure the max_vf is passed as a temporary variable.
+{ dg-final { scan-tree-dump-times {__builtin_GOMP_parallel_loop_nonmonotonic_dynamic \(.*, D\.[0-9]*, 0\);} 1 "ompexp" { target { x86_64-*-* i?86-*-* } } } } */
+
+/* Test SIMD offload devices
+{ dg-final { scan-offload-tree-dump-times {__builtin_GOMP_parallel_loop_nonmonotonic_dynamic \(.*, 64, 0\);} 1 "optimized" { target { offload_gcn } } } }
+{ dg-final { scan-offload-tree-dump-times {__builtin_GOMP_parallel_loop_nonmonotonic_dynamic \(.*, 7, 0\);} 1 "optimized" { target { offload_nvptx } } } } */
+
+int main() {}
diff --git a/libgomp/testsuite/libgomp.c/max_vf-2.c b/libgomp/testsuite/libgomp.c/max_vf-2.c
new file mode 100644
index 00000000000..91744c309df
--- /dev/null
+++ b/libgomp/testsuite/libgomp.c/max_vf-2.c
@@ -0,0 +1,21 @@
+/* Ensure that the default safelen is set correctly for the larger of the host
+   and offload device, to prevent defeating the vectorizer. */
+
+/* { dg-require-effective-target offloading_enabled } */
+
+/* { dg-do link } */
+/* { dg-options "-fopenmp -O2 -fdump-tree-omplower" } */
+
+int f(float *a, float *b, int n)
+{
+  float sum = 0;
+  #pragma omp target teams distribute parallel for simd map(tofrom: sum) reduction(+:sum)
+  for (int i = 0; i < n; i++)
+    sum += a[i] * b[i];
+  return sum;
+}
+
+/* Make sure that the max_vf used is suitable for the offload device.
+{ dg-final { scan-tree-dump-times {omp simd safelen\(64\)} 1 "omplower" { target { offload_gcn } } } } */
+
+int main() {}
--
2.46.0
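
For readers wondering where the scanned constants come from: the chunk
sizes the tests above look for (16 on x86_64 with -msse2, 64 on GCN, 7 on
nvptx) are consistent with rounding the requested chunk size of 7 up to a
multiple of the maximum vectorization factor, which is how the OpenMP
"simd" schedule modifier is specified. The sketch below only illustrates
that arithmetic. The helper names round_chunk_to_vf and default_safelen
are hypothetical, this is not GCC's implementation, and the factors used
(16, 64, 1) are assumptions inferred from the expected scan patterns.

```c
#include <assert.h>

/* Hypothetical helper, not GCC code: round a requested schedule chunk
   size up to the next multiple of the vectorization factor VF.  */
unsigned
round_chunk_to_vf (unsigned chunk, unsigned vf)
{
  assert (vf > 0);
  return (chunk + vf - 1) / vf * vf;
}

/* Hypothetical helper, not GCC code: choose a default safelen as the
   larger of the host and offload-device vectorization factors, as the
   comment in max_vf-2.c describes.  */
unsigned
default_safelen (unsigned host_vf, unsigned device_vf)
{
  return host_vf > device_vf ? host_vf : device_vf;
}
```

Under those assumed factors, round_chunk_to_vf (7, 16) gives 16,
round_chunk_to_vf (7, 64) gives 64, and round_chunk_to_vf (7, 1) leaves
the chunk at 7; default_safelen (16, 64) gives the safelen(64) that
max_vf-2.c scans for on GCN.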