Perhaps this is easier to review with the patch actually attached ...

On 02.02.23 15:31, Tobias Burnus wrote:
Now that the stack handling has been changed for AMDGCN, this patch
enables reverse offload.
(cf. today's "[committed] amdgcn, libgomp: Manually allocated stacks" patch
email/commit by Andrew).

Any comments, suggestions?

Tobias
-----------------
Siemens Electronic Design Automation GmbH; Address: Arnulfstraße 201, 80634
München; limited liability company; Managing Directors: Thomas Heurung,
Frank Thürauf; Registered office: München; Commercial register: München,
HRB 106955

libgomp: enable reverse offload for AMDGCN

libgomp/ChangeLog:

	* libgomp.texi (5.0 Impl. Status, gcn specifics): Update for
	reverse offload.
	* plugin/plugin-gcn.c (GOMP_OFFLOAD_get_num_devices): Accept
	reverse-offload requirement.

 libgomp/libgomp.texi        | 13 ++++++++-----
 libgomp/plugin/plugin-gcn.c |  3 ++-
 2 files changed, 10 insertions(+), 6 deletions(-)

diff --git a/libgomp/libgomp.texi b/libgomp/libgomp.texi
index 1f84b050eb2..698ae330942 100644
--- a/libgomp/libgomp.texi
+++ b/libgomp/libgomp.texi
@@ -227,8 +227,7 @@ The OpenMP 4.5 specification is fully supported.
 @item @code{allocate} directive @tab N @tab
 @item @code{allocate} clause @tab P @tab Initial support
 @item @code{use_device_addr} clause on @code{target data} @tab Y @tab
-@item @code{ancestor} modifier on @code{device} clause
-      @tab Y @tab Host fallback with GCN devices
+@item @code{ancestor} modifier on @code{device} clause @tab Y @tab
 @item Implicit declare target directive @tab Y @tab
 @item Discontiguous array section with @code{target update} construct
       @tab N @tab
@@ -4455,9 +4454,13 @@ The implementation remark:
 @item I/O within OpenMP target regions and OpenACC parallel/kernels is supported
       using the C library @code{printf} functions and the Fortran
       @code{print}/@code{write} statements.
-@item OpenMP code that has a requires directive with @code{unified_address},
-      @code{unified_shared_memory} or @code{reverse_offload} will remove
-      any GCN device from the list of available devices (``host fallback'').
+@item Reverse offload regions (i.e. @code{target} regions with
+      @code{device(ancestor:1)}) are processed serially per @code{target} region
+      such that the next reverse offload region is only executed after the previous
+      one returned.
+@item OpenMP code that has a requires directive with @code{unified_address} or
+      @code{unified_shared_memory} will remove any GCN device from the list of
+      available devices (``host fallback'').
 @end itemize
 
 
diff --git a/libgomp/plugin/plugin-gcn.c b/libgomp/plugin/plugin-gcn.c
index a7b35059ab3..11ce6b0fa8d 100644
--- a/libgomp/plugin/plugin-gcn.c
+++ b/libgomp/plugin/plugin-gcn.c
@@ -3262,7 +3262,8 @@ GOMP_OFFLOAD_get_num_devices (unsigned int omp_requires_mask)
     return 0;
   /* Return -1 if no omp_requires_mask cannot be fulfilled but
      devices were present.  */
-  if (hsa_context.agent_count > 0 && omp_requires_mask != 0)
+  if (hsa_context.agent_count > 0
+      && (omp_requires_mask & ~GOMP_REQUIRES_REVERSE_OFFLOAD) != 0)
     return -1;
   return hsa_context.agent_count;
 }
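
For reviewers who want to check the new condition in isolation, here is a minimal standalone sketch of the mask test. The SKETCH_* names and their bit values are hypothetical stand-ins rather than the real GOMP_REQUIRES_* constants, and sketch_get_num_devices is only a model of the patched check; it leaves out the plugin initialization that precedes it in GOMP_OFFLOAD_get_num_devices.

```c
/* Hypothetical stand-ins for the GOMP_REQUIRES_* bits.  The real
   constants are defined elsewhere in GCC and are not part of this
   patch; the values below are assumptions for illustration only.  */
#define SKETCH_REQUIRES_UNIFIED_ADDRESS        0x1u
#define SKETCH_REQUIRES_UNIFIED_SHARED_MEMORY  0x2u
#define SKETCH_REQUIRES_REVERSE_OFFLOAD        0x4u

/* Model of the patched condition: clear the reverse-offload bit first,
   then reject the devices (-1) only if some other requirement remains.
   AGENT_COUNT stands in for hsa_context.agent_count.  */
static int
sketch_get_num_devices (int agent_count, unsigned int omp_requires_mask)
{
  if (agent_count > 0
      && (omp_requires_mask & ~SKETCH_REQUIRES_REVERSE_OFFLOAD) != 0)
    return -1;
  return agent_count;
}
```

With two agents, a mask that contains only the reverse-offload bit now yields 2, while a mask that also contains the unified-address or unified-shared-memory bit still yields -1 (host fallback), matching the updated libgomp.texi text.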
