Andrew: Does the GCN change look okay to you?

This patch permits using GCN devices with 'omp requires unified_address', which in principle works already, except that the requirement handling disabled it.
(It also updates libgomp.texi for this change and likewise for an older nvptx change.)

I will later add a testcase → https://gcc.gnu.org/PR109837

However, the patch was tested with the respective sollve_vv testcase, with an additional fix applied on top → https://github.com/SOLLVE/sollve_vv/pull/737

(I do note that with the USM patches for OG12/OG13, unified_address is accepted, cf. OG13 https://gcc.gnu.org/g:3ddf3565faee70e8c910d90ab0c80e71813a0ba1 , but USM itself goes far beyond what we need here.)

Tobias
-----------------
Siemens Electronic Design Automation GmbH; Address: Arnulfstraße 201, 80634 Munich; limited liability company (GmbH); Managing Directors: Thomas Heurung, Frank Thürauf; Registered office: Munich; Commercial register: Munich, HRB 106955
libgomp: plugin-gcn - support 'unified_address'

Effectively, for GCN (as for nvptx) there is a common address space between
host and device, whether accessible or not. Thus, this commit permits using
'omp requires unified_address' with GCN devices. (nvptx accepts this
requirement since r13-3460-g131d18e928a3ea.)

libgomp/

	* plugin/plugin-gcn.c (GOMP_OFFLOAD_get_num_devices): Regard
	unified_address requirement as supported.
	* libgomp.texi (OpenMP 5.0, AMD Radeon, nvptx): Remove
	'unified_address' from the not-supported requirements.

 libgomp/libgomp.texi        | 9 ++++-----
 libgomp/plugin/plugin-gcn.c | 4 +++-
 2 files changed, 7 insertions(+), 6 deletions(-)

diff --git a/libgomp/libgomp.texi b/libgomp/libgomp.texi
index 76c56a73969..a3d370a0fb3 100644
--- a/libgomp/libgomp.texi
+++ b/libgomp/libgomp.texi
@@ -192,8 +192,7 @@ The OpenMP 4.5 specification is fully supported.
       env variable @tab Y @tab
 @item Nested-parallel changes to @emph{max-active-levels-var} ICV @tab Y @tab
 @item @code{requires} directive @tab P
-      @tab complete but no non-host devices provides @code{unified_address} or
-      @code{unified_shared_memory}
+      @tab complete but no non-host devices provides @code{unified_shared_memory}
 @item @code{teams} construct outside an enclosing target region @tab Y @tab
 @item Non-rectangular loop nests @tab P @tab Full support for C/C++, partial for Fortran
 @item @code{!=} as relational-op in canonical loop form for C/C++ @tab Y @tab
@@ -4460,7 +4459,7 @@ The implementation remark:
   @code{device(ancestor:1)}) are processed serially per @code{target} region
   such that the next reverse offload region is only executed after the previous
   one returned.
-@item OpenMP code that has a requires directive with @code{unified_address} or
+@item OpenMP code that has a @code{requires} directive with
   @code{unified_shared_memory} will remove any GCN device from the list of
   available devices (``host fallback'').
 @item The available stack size can be changed using the @code{GCN_STACK_SIZE}
@@ -4522,8 +4521,8 @@ The implementation remark:
   Per device, reverse offload regions are processed serially such that
   the next reverse offload region is only executed after the previous
   one returned.
-@item OpenMP code that has a requires directive with @code{unified_address}
-  or @code{unified_shared_memory} will remove any nvptx device from the
+@item OpenMP code that has a @code{requires} directive with
+  @code{unified_shared_memory} will remove any nvptx device from the
   list of available devices (``host fallback'').
 @end itemize

diff --git a/libgomp/plugin/plugin-gcn.c b/libgomp/plugin/plugin-gcn.c
index 2181bf0235f..ef22d48da79 100644
--- a/libgomp/plugin/plugin-gcn.c
+++ b/libgomp/plugin/plugin-gcn.c
@@ -3231,7 +3231,9 @@ GOMP_OFFLOAD_get_num_devices (unsigned int omp_requires_mask)
   /* Return -1 if no omp_requires_mask cannot be fulfilled but
      devices were present.  */
   if (hsa_context.agent_count > 0
-      && (omp_requires_mask & ~GOMP_REQUIRES_REVERSE_OFFLOAD) != 0)
+      && ((omp_requires_mask
+	   & ~(GOMP_REQUIRES_UNIFIED_ADDRESS
+	       | GOMP_REQUIRES_REVERSE_OFFLOAD)) != 0))
     return -1;
   return hsa_context.agent_count;
 }