Re: [PATCH][libgomp, testsuite, nvptx] Limit recursion in declare_target-{1,2}.f90

Tom de Vries via Gcc-patches Fri, 01 Apr 2022 08:35:04 -0700

On 4/1/22 14:28, Thomas Schwinge wrote:

Hi Tom!


On 2022-04-01T13:24:40+0200, Tom de Vries <tdevr...@suse.de> wrote:

When running testcases libgomp.fortran/examples-4/declare_target-{1,2}.f90 on
an RTX A2000 (sm_86) with driver 510.60.02 and with GOMP_NVPTX_JIT=-O0 I run
into:
...
FAIL: libgomp.fortran/examples-4/declare_target-1.f90 -O0 \
   -DGOMP_NVPTX_JIT=-O0 execution test
FAIL: libgomp.fortran/examples-4/declare_target-2.f90 -O0 \
   -DGOMP_NVPTX_JIT=-O0 execution test
...

Fix this by further limiting recursion depth in the test-cases for nvptx.

Furthermore, make the recursion depth limiting nvptx-specific.


Careful:

--- a/libgomp/testsuite/libgomp.fortran/examples-4/declare_target-1.f90
+++ b/libgomp/testsuite/libgomp.fortran/examples-4/declare_target-1.f90
@@ -1,4 +1,16 @@
  ! { dg-do run }
+! { dg-additional-options "-cpp" }
+! Reduced from 25 to 23, otherwise execution runs out of thread stack on
+! Nvidia Titan V.
+! Reduced from 23 to 22, otherwise execution runs out of thread stack on
+! Nvidia T400 (2GB variant), when run with GOMP_NVPTX_JIT=-O0.
+! Reduced from 22 to 20, otherwise execution runs out of thread stack on
+! Nvidia RTX A2000 (6GB variant), when run with GOMP_NVPTX_JIT=-O0.
+! { dg-additional-options "-DREC_DEPTH=20" { target { offload_target_nvptx } } } */


'offload_target_nvptx' doesn't mean that offloading execution is done on
nvptx, but rather that we're "*compiling* for offload target nvptx"
(emphasis mine).  That means, with such a change we're now getting
different behavior in a system with an AMD GPU, when using a toolchain
that only has GCN offloading configured vs. a toolchain that has GCN and
nvptx offloading configured.  This isn't going to cause any real
problems, of course, but it's confusing, and a bad example of
'offload_target_nvptx'.

'offload_device_nvptx' ought to work: "using nvptx offload device".


Thanks for pointing that out.

I tried to understand this multiple offloading configuration a bit, and came up with the following mental model: it's possible to have a host with, say, an nvptx and an AMD offloading device, and then configure and build a toolchain that can generate a single executable that can offload to either device, depending on the value of the appropriate OpenACC/OpenMP environment variables.

So, in principle the libgomp testsuite could have a mode in which it does that: run the same executable twice, once for each offloading device. In that case, even using offload_device_nvptx would not be accurate enough, and we'd need to test for offload device type at runtime, as used to be done in libgomp/testsuite/libgomp.fortran/task-detach-6.f90.

I've tried to copy that setup to libgomp/testsuite/libgomp.fortran/examples-4/declare_target-1.f90, but that doesn't seem to work anymore. I've also tried copying that test-case to libgomp/testsuite/libgomp.fortran/copy-of-declare_target-1.f90 to rule out any subdir-related problems, but no luck there either.

Attached is that copy approach; could you try it out and see if it works for you?


Do you perhaps have an idea why it's failing?

I can make a patch using offload_device_nvptx, but I'd prefer to understand first why the approach above isn't working.


Thanks,
- Tom

[libgomp/testsuite] Add libgomp.fortran/copy-of-declare_target-1.f90

---
 .../libgomp.fortran/copy-of-declare_target-1.f90   | 49 ++++++++++++++++++++++
 1 file changed, 49 insertions(+)

diff --git a/libgomp/testsuite/libgomp.fortran/copy-of-declare_target-1.f90 b/libgomp/testsuite/libgomp.fortran/copy-of-declare_target-1.f90
new file mode 100644
index 00000000000..6dcf5312070
--- /dev/null
+++ b/libgomp/testsuite/libgomp.fortran/copy-of-declare_target-1.f90
@@ -0,0 +1,49 @@
+! { dg-do run }
+! { dg-additional-sources on_device_arch.c }
+
+module e_53_1_mod
+  integer :: THRESHOLD = 20
+contains
+  integer recursive function fib (n) result (f)
+    !$omp declare target
+    integer :: n
+    if (n <= 0) then
+      f = 0
+    else if (n == 1) then
+      f = 1
+    else
+      f = fib (n - 1) + fib (n - 2)
+    end if
+  end function
+
+  integer function fib_wrapper (n)
+    integer :: x
+    !$omp target map(to: n) map(from: x) if(n > THRESHOLD)
+      x = fib (n)
+    !$omp end target
+    fib_wrapper = x
+  end function
+end module
+
+program e_53_1
+  use e_53_1_mod, only : fib, fib_wrapper
+  integer :: REC_DEPTH = 25
+
+  interface
+    integer function on_device_arch_nvptx() bind(C)
+    end function on_device_arch_nvptx
+  end interface
+
+  if (on_device_arch_nvptx () /= 0) then
+     ! Reduced from 25 to 23, otherwise execution runs out of thread stack on
+     ! Nvidia Titan V.
+     ! Reduced from 23 to 22, otherwise execution runs out of thread stack on
+     ! Nvidia T400 (2GB variant), when run with GOMP_NVPTX_JIT=-O0.
+     ! Reduced from 22 to 20, otherwise execution runs out of thread stack on
+     ! Nvidia RTX A2000 (6GB variant), when run with GOMP_NVPTX_JIT=-O0.
+     REC_DEPTH = 20
+  end if
+
+  if (fib (15) /= fib_wrapper (15)) stop 1
+  if (fib (REC_DEPTH) /= fib_wrapper (REC_DEPTH)) stop 2
+end program
