Updated patch:
* I noticed that the API functions in omp.h.in (and in the spec since
OpenMP 6.0) take omp_interop_rc_t* rather than int*.
Thus, I updated the documentation to match omp.h.in. Unfortunately, the
difference matters for C++; the enum itself has been available since 5.1,
and C does not care.
* As suggested below, I now use a normal, non-typewriter font for
the values, except for the first one, which is an identifier (enum/parameter).
* I now include an example illustrating how to obtain the value.
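
To make the C++ point in the first item concrete, here is a minimal sketch
of the calling pattern. It deliberately does not include omp.h:
demo_interop_rc_t, demo_interop_property_t, demo_get_interop_int and
DEMO_IPR_VENDOR are hypothetical stand-ins for the real libgomp names, the
enum values are made up, and 11 is simply the vendor number listed in the
nvptx tables of the patch.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins so that the sketch builds without <omp.h>;
   the names and the enum values are illustrative only.  */
typedef enum
{
  demo_irc_success = 0,
  demo_irc_out_of_range = -1
} demo_interop_rc_t;

typedef enum
{
  DEMO_IPR_VENDOR = 0
} demo_interop_property_t;

/* Mock with an OpenMP 6.0 style signature: ret_code points to the
   enum type, not to int.  */
static intptr_t
demo_get_interop_int (demo_interop_property_t property_id,
                      demo_interop_rc_t *ret_code)
{
  int known = (property_id == DEMO_IPR_VENDOR);
  if (ret_code != NULL)  /* NULL means the caller ignores the code.  */
    *ret_code = known ? demo_irc_success : demo_irc_out_of_range;
  return known ? 11 : 0;
}

/* Caller written against the enum type.  Declaring 'int ret;' instead
   and passing '&ret' is the variant that C++ rejects, because C++ has
   no implicit conversion from 'int *' to a pointer to an enum.  */
static intptr_t
demo_query_vendor (void)
{
  demo_interop_rc_t ret;
  intptr_t vendor = demo_get_interop_int (DEMO_IPR_VENDOR, &ret);
  return ret == demo_irc_success ? vendor : -1;
}
```

Code that only ever passes NULL for ret_code is unaffected by the prototype
change, which is why that case stays compatible between 5.x and 6.x.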
* * *
Tobias Burnus wrote:
For the string-valued constants in the table, please include the
quotes, unless those are identifiers instead of string literals.
Except for the first, which is an identifier of a named constant
(parameter/enum value), all others are the values returned by the API
function, i.e. 11 is the integral value returned by omp_get_interop_int
(...), and ‘amd’ is the value of the string.
Thus, except for the first one, they do not actually need to use the
typewriter font but could also be a plain 11, nvidia or ``nvidia''.
Actually, the same is kind of true for the property names, but I guess
that with the underscores it is nicer to keep using @code (both in the
source and in the rendered output).
Thoughts on this version? I think it is better now.
Tobias
libgomp.texi: Document supported OpenMP 'interop' types for nvptx and gcn
Note that this commit also updates the API interface to OpenMP 6.0;
while 5.1 and 5.2 use 'int *' for the ret_code argument,
OpenMP 6.0 changed this to 'omp_interop_rc_t *'; this enum also exists in
OpenMP 5.1. However, C++ does not implicitly convert between the two
pointer types; hence, unless NULL is passed (i.e. the argument is ignored),
OpenMP 5.x and 6.x are not compatible.
Note that GCC's omp.h already follows OpenMP 6.0 and is now in sync with
the documentation.
libgomp/ChangeLog:
* libgomp.texi (OpenMP 5.1): Add @ref to offload-target specifics
for 'interop'.
(OpenMP 6.0): Mark dispatch's interop clause as implemented.
(omp_get_interop_int, omp_get_interop_str,
omp_get_interop_ptr, omp_get_interop_type_desc): Add @ref to
Offload-Target Specifics; change ret_code argument type to
'omp_interop_rc_t *'.
(Offload-Target Specifics): Document the supported OpenMP
interop foreign runtimes on AMD and Nvidia GPUs.
libgomp/libgomp.texi | 170 ++++++++++++++++++++++++++++++++++++++++++++++++---
1 file changed, 161 insertions(+), 9 deletions(-)
diff --git a/libgomp/libgomp.texi b/libgomp/libgomp.texi
index d1cf9be47ca..ad3649f8536 100644
--- a/libgomp/libgomp.texi
+++ b/libgomp/libgomp.texi
@@ -314,7 +314,7 @@ The OpenMP 4.5 specification is fully supported.
clauses @tab N @tab
@item Indirect calls to the device version of a procedure or function in
@code{target} regions @tab Y @tab
-@item @code{interop} directive @tab N @tab
+@item @code{interop} directive @tab Y @tab Cf. @ref{Offload-Target Specifics}
@item @code{omp_interop_t} object support in runtime routines @tab Y @tab
@item @code{nowait} clause in @code{taskwait} directive @tab Y @tab
@item Extensions to the @code{atomic} directive @tab Y @tab
@@ -545,7 +545,7 @@ to address of matching mapped list item per 5.1, Sect. 2.21.7.2 @tab N @tab
@tab N @tab
@item Semicolon-separated list to @code{uses_allocators} @tab N @tab
@item New @code{need_device_addr} modifier to @code{adjust_args} clause @tab N @tab
-@item @code{interop} clause to @code{dispatch} @tab N @tab
+@item @code{interop} clause to @code{dispatch} @tab Y @tab
@item Scope requirement changes for @code{declare_target} @tab N @tab
@item @code{message} and @code{severity} clauses to @code{parallel} directive
@tab N @tab
@@ -3048,7 +3048,7 @@ the initial device is unspecified.
@item @emph{C/C++}:
@multitable @columnfractions .20 .80
@item @emph{Prototype}: @tab @code{omp_intptr_t omp_get_interop_int(const omp_interop_t interop,
- omp_interop_property_t property_id, int *ret_code)}
+ omp_interop_property_t property_id, omp_interop_rc_t *ret_code)}
@end multitable
@item @emph{Fortran}:
@@ -3062,7 +3062,8 @@ the initial device is unspecified.
@end multitable
@item @emph{See also}:
-@ref{omp_get_interop_ptr}, @ref{omp_get_interop_str}, @ref{omp_get_interop_rc_desc}
+@ref{omp_get_interop_ptr}, @ref{omp_get_interop_str}, @ref{omp_get_interop_rc_desc},
+@ref{Offload-Target Specifics}
@item @emph{Reference}:
@uref{https://www.openmp.org, OpenMP specification v5.1}, Section 3.12.2,
@@ -3093,7 +3094,7 @@ the initial device is unspecified.
@item @emph{C/C++}:
@multitable @columnfractions .20 .80
@item @emph{Prototype}: @tab @code{void *omp_get_interop_ptr(const omp_interop_t interop,
- omp_interop_property_t property_id, int *ret_code)}
+ omp_interop_property_t property_id, omp_interop_rc_t *ret_code)}
@end multitable
@item @emph{Fortran}:
@@ -3107,7 +3108,8 @@ the initial device is unspecified.
@end multitable
@item @emph{See also}:
-@ref{omp_get_interop_int}, @ref{omp_get_interop_str}, @ref{omp_get_interop_rc_desc}
+@ref{omp_get_interop_int}, @ref{omp_get_interop_str}, @ref{omp_get_interop_rc_desc},
+@ref{Offload-Target Specifics}
@item @emph{Reference}:
@uref{https://www.openmp.org, OpenMP specification v5.1}, Section 3.12.3,
@@ -3137,7 +3139,7 @@ the initial device is unspecified.
@item @emph{C/C++}:
@multitable @columnfractions .20 .80
@item @emph{Prototype}: @tab @code{const char *omp_get_interop_str(const omp_interop_t interop,
- omp_interop_property_t property_id, int *ret_code)}
+ omp_interop_property_t property_id, omp_interop_rc_t *ret_code)}
@end multitable
@item @emph{Fortran}:
@@ -3151,7 +3153,8 @@ the initial device is unspecified.
@end multitable
@item @emph{See also}:
-@ref{omp_get_interop_int}, @ref{omp_get_interop_ptr}, @ref{omp_get_interop_rc_desc}
+@ref{omp_get_interop_int}, @ref{omp_get_interop_ptr}, @ref{omp_get_interop_rc_desc},
+@ref{Offload-Target Specifics}
@item @emph{Reference}:
@uref{https://www.openmp.org, OpenMP specification v5.1}, Section 3.12.4,
@@ -3234,7 +3237,8 @@ a null pointer is returned. The effect of running this routine in a
@end multitable
@item @emph{See also}:
-@ref{omp_get_num_interop_properties}, @ref{omp_get_interop_name}
+@ref{omp_get_num_interop_properties}, @ref{omp_get_interop_name},
+@ref{Offload-Target Specifics}
@item @emph{Reference}:
@uref{https://www.openmp.org, OpenMP specification v5.1}, Section 3.12.6,
@@ -6837,6 +6841,10 @@ The following sections present notes on the offload-target specifics
@node AMD Radeon
@section AMD Radeon (GCN)
+@menu
+* Foreign-runtime support for AMD GPUs::
+@end menu
+
On the hardware side, there is the hierarchy (fine to coarse):
@itemize
@item work item (thread)
@@ -6912,10 +6920,75 @@ The implementation remark:
@end itemize
+@node Foreign-runtime support for AMD GPUs
+@subsection OpenMP @code{interop} -- Foreign-Runtime Support for AMD GPUs
+
+On AMD GPUs, the foreign runtimes are HIP (C++ Heterogeneous-Compute Interface
+for Portability) and HSA (Heterogeneous System Architecture),
+where HIP is the default. The interop object is created using OpenMP's
+@code{interop} directive or, implicitly, when invoking a @code{declare variant}
+procedure that has the @code{append_args} clause. In either case, the
+@code{prefer_type} modifier determines whether HIP or HSA is used.
+
+When the @code{targetsync} modifier is specified, a stream is created using
+@code{hipStreamCreate} for HIP; for HSA, a queue of type
+@code{HSA_QUEUE_TYPE_MULTI} with a queue size of 64 is created.
+
+Invoke the @ref{Interoperability Routines} on an interop object to obtain
+the following properties. For properties with integral (int), pointer (ptr),
+or string (str) data type, call @code{omp_get_interop_int},
+@code{omp_get_interop_ptr}, or @code{omp_get_interop_str}, respectively.
+Note that @code{device_num} is the OpenMP device number
+while @code{device} is the HIP device number or HSA device handle.
+
+For the API routine call, add the prefix @code{omp_ipr_} to the property name;
+for instance:
+@smallexample
+omp_interop_rc_t ret;
+int device_num = omp_get_interop_int (my_interop_obj, omp_ipr_device_num, &ret);
+@end smallexample
+
+@noindent
+Available properties for an HIP interop object:
+
+@multitable @columnfractions .20 .35 .20 .20
+@headitem Property @tab C data type @tab API routine @tab value (if constant)
+@item @code{fr_id} @tab @code{omp_interop_fr_t} @tab int @tab @code{omp_fr_hip}
+@item @code{fr_name} @tab @code{const char *} @tab str @tab ``hip''
+@item @code{vendor} @tab @code{int} @tab int @tab 1
+@item @code{vendor_name} @tab @code{const char *} @tab str @tab ``amd''
+@item @code{device_num} @tab @code{int} @tab int @tab
+@item @code{platform} @tab N/A @tab @tab
+@item @code{device} @tab @code{hipDevice_t} @tab int @tab
+@item @code{device_context} @tab @code{hipCtx_t} @tab ptr @tab
+@item @code{targetsync} @tab @code{hipStream_t} @tab ptr @tab
+@end multitable
+
+@noindent
+Available properties for an HSA interop object:
+
+@multitable @columnfractions .20 .35 .20 .20
+@headitem Property @tab C data type @tab API routine @tab value (if constant)
+@item @code{fr_id} @tab @code{omp_interop_fr_t} @tab int @tab @code{omp_fr_hsa}
+@item @code{fr_name} @tab @code{const char *} @tab str @tab ``hsa''
+@item @code{vendor} @tab @code{int} @tab int @tab 1
+@item @code{vendor_name} @tab @code{const char *} @tab str @tab ``amd''
+@item @code{device_num} @tab @code{int} @tab int @tab
+@item @code{platform} @tab N/A @tab @tab
+@item @code{device} @tab @code{hsa_agent *} @tab ptr @tab
+@item @code{device_context} @tab N/A @tab @tab
+@item @code{targetsync} @tab @code{hsa_queue *} @tab ptr @tab
+@end multitable
+
+
@node nvptx
@section nvptx
+@menu
+* Foreign-runtime support for Nvidia GPUs::
+@end menu
+
On the hardware side, there is the hierarchy (fine to coarse):
@itemize
@item thread
@@ -7008,6 +7081,85 @@ The implementation remark:
@end itemize
+@node Foreign-runtime support for Nvidia GPUs
+@subsection OpenMP @code{interop} -- Foreign-Runtime Support for Nvidia GPUs
+
+On Nvidia GPUs, the foreign-runtime APIs are the CUDA runtime API, the CUDA
+driver API, and HIP, the C++ Heterogeneous-Compute Interface for Portability
+that is---on CUDA-based systems---a very thin layer on top of the CUDA API. By
+default, CUDA is used. The interop object is created using OpenMP's
+@code{interop} directive or, implicitly, when invoking a @code{declare variant}
+procedure that has the @code{append_args} clause. In either case, the
+@code{prefer_type} modifier determines whether CUDA, CUDA driver, or HIP is
+used.
+
+When specifying the @code{targetsync} modifier, a CUDA stream is created using
+the @code{CU_STREAM_DEFAULT} flag.
+
+Invoke the @ref{Interoperability Routines} on an interop object to obtain
+the following properties. For properties with integral (int), pointer (ptr),
+or string (str) data type, call @code{omp_get_interop_int},
+@code{omp_get_interop_ptr}, or @code{omp_get_interop_str}, respectively.
+Note that @code{device_num} is the OpenMP device number while @code{device}
+is the CUDA, CUDA Driver, or HIP device number.
+
+For the API routine call, add the prefix @code{omp_ipr_} to the property name;
+for instance:
+@smallexample
+omp_interop_rc_t ret;
+int device_num = omp_get_interop_int (my_interop_obj, omp_ipr_device_num, &ret);
+@end smallexample
+
+@noindent
+Available properties for a CUDA runtime API interop object:
+
+@multitable @columnfractions .20 .35 .20 .20
+@headitem Property @tab C data type @tab API routine @tab value (if constant)
+@item @code{fr_id} @tab @code{omp_interop_fr_t} @tab int @tab @code{omp_fr_cuda}
+@item @code{fr_name} @tab @code{const char *} @tab str @tab ``cuda''
+@item @code{vendor} @tab @code{int} @tab int @tab 11
+@item @code{vendor_name} @tab @code{const char *} @tab str @tab ``nvidia''
+@item @code{device_num} @tab @code{int} @tab int @tab
+@item @code{platform} @tab N/A @tab @tab
+@item @code{device} @tab @code{int} @tab int @tab
+@item @code{device_context} @tab N/A @tab @tab
+@item @code{targetsync} @tab @code{cudaStream_t} @tab ptr @tab
+@end multitable
+
+@noindent
+Available properties for a CUDA driver API interop object:
+
+@multitable @columnfractions .20 .35 .20 .20
+@headitem Property @tab C data type @tab API routine @tab value (if constant)
+@item @code{fr_id} @tab @code{omp_interop_fr_t} @tab int @tab @code{omp_fr_cuda_driver}
+@item @code{fr_name} @tab @code{const char *} @tab str @tab ``cuda_driver''
+@item @code{vendor} @tab @code{int} @tab int @tab 11
+@item @code{vendor_name} @tab @code{const char *} @tab str @tab ``nvidia''
+@item @code{device_num} @tab @code{int} @tab int @tab
+@item @code{platform} @tab N/A @tab @tab
+@item @code{device} @tab @code{CUdevice} @tab int @tab
+@item @code{device_context} @tab @code{CUcontext} @tab ptr @tab
+@item @code{targetsync} @tab @code{CUstream} @tab ptr @tab
+@end multitable
+
+@noindent
+Available properties for an HIP interop object:
+
+@multitable @columnfractions .20 .35 .20 .20
+@headitem Property @tab C data type @tab API routine @tab value (if constant)
+@item @code{fr_id} @tab @code{omp_interop_fr_t} @tab int @tab @code{omp_fr_hip}
+@item @code{fr_name} @tab @code{const char *} @tab str @tab ``hip''
+@item @code{vendor} @tab @code{int} @tab int @tab 11
+@item @code{vendor_name} @tab @code{const char *} @tab str @tab ``nvidia''
+@item @code{device_num} @tab @code{int} @tab int @tab
+@item @code{platform} @tab N/A @tab @tab
+@item @code{device} @tab @code{hipDevice_t} @tab int @tab
+@item @code{device_context} @tab @code{hipCtx_t} @tab ptr @tab
+@item @code{targetsync} @tab @code{hipStream_t} @tab ptr @tab
+@end multitable
+
+
+
@c ---------------------------------------------------------------------
@c The libgomp ABI
@c ---------------------------------------------------------------------