Hi! On 2025-05-07T13:58:38+0200, Tobias Burnus <tbur...@baylibre.com> wrote: > Committed asr16-445-g9565076f9b8105. This test supports mapping + accessing > the vtab > of the polymorphic variable on the host. Obviously, this only works if > the host pointer is device accessible ("unified-shared memory"). In > principle, we want to check for this - and enable some subtests. The > enabling/disabling works, but there is no simple USM check. Currently, > only the host gets the extra tests, the new test also checks it with > devices, but unfortunately the USM requirement unconditionally switches > to self maps. Thus, while it is useful to test this (hence, this > commit), it does not check what we actually want ...
Currently, GCC/OpenMP has '#pragma omp requires unified_shared_memory' equaling '#pragma omp requires self_maps', as you write. I understand that may change in the future. Despite we and/or the respective OpenMP committee still working out what the exact semantic differences are to be, I agree that it's already useful to add "USM" test cases to libgomp. We have currently the following testing run-time configurations: (a) no offloading device available, (b) offloading device available that's not USM-capable, (c) USM-capable offloading device available. To achieve full libgomp OpenMP test coverage, it is understood that two separate 'check-target-libgomp' runs are necessary, for (a) and (b) run-time configurations; given (b) run-time configuration, we're normally not running test cases in (a) run-time configuration (with a few exceptions for specific test cases). (As you know, libgomp OpenACC testing is doing that differently, but that's not the subject of this discussion here.) Now, given (c) run-time configuration (and unless '#pragma omp requires unified_shared_memory' or '#pragma omp requires self_maps' specified) we're normally still running offloading test cases in (b) mode: create data mappings as per OpenMP directives. (..., which also may change in the future, I understand.) Therefore, we shall add (as you've begun) select '*-usm.*' test cases, which '#pragma omp requires unified_shared_memory' and then '#include' the original test case, to test the latter one in USM mode, while also still keeping (b) run-time configuration testing alive. (We can't just add '#pragma omp requires unified_shared_memory' to the original test cases, because that'd lose (b) run-time configuration testing.) I suggest that we restrict such new '*-usm.*' test cases to effective-target 'offload_device_usm': they don't add value if running in (a) run-time configuration (where the '*-usm.*' test case and the respective original test case test the very same thing), and also don't add value in (b) run-time configuration (where for the '*-usm.*' test cases, libgomp then has to resort to host-fallback execution in order to fulfil the USM requirement -- and proper host-fallback execution testing needs a separate test run anyway, for full libgomp OpenMP test coverage, as discussed above. (Please let me know if you don't understand my rationale; I appear to have some difficulties in formulating it coherently...) ;-) Anyway, maybe a patch is easier to read... OK to push the attached "Add effective-target 'offload_device_usm', 'libgomp.c-c++-common/target-usm-1.c'"? > PS: My plans > for the future is to permit more fine tuning and some default changes. > In particular, if a device is an APU, it should by default use self > mapping. Additionally, the user should have the possibility to switch > between mapping and self mapping for devices. Also with a command-line flag and/or environment variable in addition to source-code-level '#pragma omp requires unified_shared_memory' etc., I suppose? > Depending on the access > pattern, way how USM is implemented and location of host vs. device > process, copying (mapping) or direct access (self map) is faster. > Additionally, copying might run into storage-size issues. ACK. Grüße Thomas > commit 9565076f9b810541aeb63cb621d694326aa12216 > Author: Tobias Burnus <tbur...@baylibre.com> > Date: Wed May 7 13:46:51 2025 +0200 > > libgomp.fortran/map-alloc-comp-9{,-usm}.f90: Add unified_shared_memory > variant > > When host memory is device accessible - independent whether mapping is > done or > not (i.e. self map), the 'vtab' pointer becomes accessible, which stores > the > dynamic type's type and size information. > > In principle, we want to test: USM available but mapping is still done, > but > as there is no simple + reliable not-crashing way to test for this, those > checks are skipped in the (pre)existing test file map-alloc-comp-9.f90. > > Or rather: those are only active with self-maps, which is currently only > true > for the host. > > This commit adds map-alloc-comp-9-usm.f90 which runs the same test with > 'omp requires unified_shared_memory'. While OpenMP permits both actual > mapping and self maps with this flag, it in theory covers the missing > cases. > However, currently, GCC always uses self maps with USM. Still, having a > device-run self-maps check is better than nothing, even if it misses the > most interesting case. > > libgomp/ChangeLog: > > * testsuite/libgomp.fortran/map-alloc-comp-9.f90: Process > differently > when USE_USM_REQUIREMENT is set. > * testsuite/libgomp.fortran/map-alloc-comp-9-usm.f90: New test. > --- > .../libgomp.fortran/map-alloc-comp-9-usm.f90 | 11 +++++++++++ > .../testsuite/libgomp.fortran/map-alloc-comp-9.f90 | 19 > +++++++++++++++++++ > 2 files changed, 30 insertions(+) > > diff --git a/libgomp/testsuite/libgomp.fortran/map-alloc-comp-9-usm.f90 > b/libgomp/testsuite/libgomp.fortran/map-alloc-comp-9-usm.f90 > new file mode 100644 > index 00000000000..90378c0e42a > --- /dev/null > +++ b/libgomp/testsuite/libgomp.fortran/map-alloc-comp-9-usm.f90 > @@ -0,0 +1,11 @@ > +! { dg-additional-options "-cpp -DUSE_USM_REQUIREMENT=1 -Wno-openmp" } > +! > +! We silence the warning: > +! Mapping of polymorphic list item '...' is unspecified behavior [-Wopenmp] > +! > +! Ensure that polymorphic mapping is diagnosed as undefined behavior > +! Ensure that static access to polymorphic variables works > + > +! Run map-alloc-comp-9.f90 in unified-shared-memory mode > + > +#include "map-alloc-comp-9.f90" > diff --git a/libgomp/testsuite/libgomp.fortran/map-alloc-comp-9.f90 > b/libgomp/testsuite/libgomp.fortran/map-alloc-comp-9.f90 > index 3cec39218f5..26c73d75c09 100644 > --- a/libgomp/testsuite/libgomp.fortran/map-alloc-comp-9.f90 > +++ b/libgomp/testsuite/libgomp.fortran/map-alloc-comp-9.f90 > @@ -1,8 +1,19 @@ > +! { dg-additional-options "-cpp" } > +! > ! Ensure that polymorphic mapping is diagnosed as undefined behavior > ! Ensure that static access to polymorphic variables works > > +! Some extended tests are only run with shared memory > +! To enforce this (where possible) on the device side: > +! #define USE_USM_REQUIREMENT > +! which is done in map-alloc-comp-9-usm.f90 > + > subroutine test(case) > implicit none(type, external) > +#ifdef USE_USM_REQUIREMENT > + !$omp requires unified_shared_memory > +#endif > + > type t > integer :: x(4) > end type t > @@ -73,10 +84,14 @@ var4%y2(2)%y%x%x = -7 * [1111,2222,3333,4444] > var4%y2(2)%y%x2(1)%x = -8 * [1111,2222,3333,4444] > var4%y2(2)%y%x2(2)%x = -9 * [1111,2222,3333,4444] > > +#ifdef USE_USM_REQUIREMENT > +is_shared_mem = .true. > +#else > is_shared_mem = .false. > !$omp target map(to: is_shared_mem) > is_shared_mem = .true. > !$omp end target > +#endif > > if (case == 1) then > ! implicit mapping > @@ -532,6 +547,10 @@ end subroutine test > program main > use omp_lib > implicit none(type, external) > +#ifdef USE_USM_REQUIREMENT > + !$omp requires unified_shared_memory > +#endif > + > interface > subroutine test(case) > integer, value :: case
>From 46fc59b5cdaa42c4dc9edaee7d52194c1f45b6b3 Mon Sep 17 00:00:00 2001 From: Thomas Schwinge <tschwi...@baylibre.com> Date: Fri, 9 May 2025 15:05:57 +0200 Subject: [PATCH] Add effective-target 'offload_device_usm', 'libgomp.c-c++-common/target-usm-1.c' Also use the new effective-target 'offload_device_usm' for restricting 'libgomp.fortran/map-alloc-comp-9-usm.f90' testing; the latter being a USM variant of 'libgomp.fortran/map-alloc-comp-9.f90'. libgomp/ * testsuite/lib/libgomp.exp (check_effective_target_offload_device_usm): New. * testsuite/libgomp.fortran/map-alloc-comp-9-usm.f90: Use it. * testsuite/libgomp.c-c++-common/target-usm-1.c: New. --- libgomp/testsuite/lib/libgomp.exp | 19 ++++++++ .../libgomp.c-c++-common/target-usm-1.c | 46 +++++++++++++++++++ .../libgomp.fortran/map-alloc-comp-9-usm.f90 | 1 + 3 files changed, 66 insertions(+) create mode 100644 libgomp/testsuite/libgomp.c-c++-common/target-usm-1.c diff --git a/libgomp/testsuite/lib/libgomp.exp b/libgomp/testsuite/lib/libgomp.exp index a620f8c2a09..cd32be1ca68 100644 --- a/libgomp/testsuite/lib/libgomp.exp +++ b/libgomp/testsuite/lib/libgomp.exp @@ -449,6 +449,8 @@ proc check_effective_target_offload_device_nonshared_as { } { } # Return 1 if offload device is available and it has shared address space. +# This doesn't consider whether '#pragma omp requires unified_shared_memory' +# may be used to switch into shared-memory mode. proc check_effective_target_offload_device_shared_as { } { return [check_runtime_nocache offload_device_shared_as { int main () @@ -461,6 +463,23 @@ proc check_effective_target_offload_device_shared_as { } { } ] } +# Return 1 if, with '#pragma omp requires unified_shared_memory' in effect, an +# USM-capable offload device is available (not considering host-fallback +# execution). +proc check_effective_target_offload_device_usm { } { + return [check_runtime_nocache offload_device_usm { + #pragma omp requires unified_shared_memory + #include <omp.h> + int main () + { + int a; + #pragma omp target map(from: a) + a = omp_is_initial_device (); + return a; + } + } ] +} + # Return 1 if using nvptx offload device. proc check_effective_target_offload_device_nvptx { } { return [check_runtime_nocache offload_device_nvptx { diff --git a/libgomp/testsuite/libgomp.c-c++-common/target-usm-1.c b/libgomp/testsuite/libgomp.c-c++-common/target-usm-1.c new file mode 100644 index 00000000000..582f80ece60 --- /dev/null +++ b/libgomp/testsuite/libgomp.c-c++-common/target-usm-1.c @@ -0,0 +1,46 @@ +/* If we have an offload device that is capable of USM... + { dg-do run { target offload_device_usm } } */ + +/* ..., and we request USM... */ +#pragma omp requires unified_shared_memory +/* (..., which in the GCC implementation equals 'self_maps'...) */ + +#include <stdint.h> +#include <omp.h> + +#define X_init 5 +static int x = X_init; + +int main() +{ + /* ..., then verify that we still have devices to offload to... */ + int num_devices = omp_get_num_devices(); + if (num_devices < 1) + __builtin_abort(); + /* ..., and for each of them (including the host, last)... */ + for (int i = 0; i <= num_devices; ++i) + { + uintptr_t xhp = (uintptr_t) &x, xdp = 0; +#pragma omp target device(device_num: i) defaultmap(tofrom) + { + /* ..., verify that we're actually using it... */ + if (omp_is_initial_device() != (i == num_devices)) + __builtin_abort(); + /* ... in USM mode. */ + xdp = (uintptr_t) &x; + if (xdp != xhp) + __builtin_abort(); + if (x++ != X_init + i) + __builtin_abort(); + int *xp = (int *) xhp; + if (*xp != X_init + i + 1) + __builtin_abort(); + } + if (xdp != xhp) + __builtin_abort(); + if (x != X_init + i + 1) + __builtin_abort(); + } + + return 0; +} diff --git a/libgomp/testsuite/libgomp.fortran/map-alloc-comp-9-usm.f90 b/libgomp/testsuite/libgomp.fortran/map-alloc-comp-9-usm.f90 index 90378c0e42a..a62ada22183 100644 --- a/libgomp/testsuite/libgomp.fortran/map-alloc-comp-9-usm.f90 +++ b/libgomp/testsuite/libgomp.fortran/map-alloc-comp-9-usm.f90 @@ -1,3 +1,4 @@ +! { dg-skip-if {} { ! offload_device_usm } } ! { dg-additional-options "-cpp -DUSE_USM_REQUIREMENT=1 -Wno-openmp" } ! ! We silence the warning: -- 2.34.1