On 22/08/2024 19:26, Tobias Burnus wrote:
This patch adds OpenMP's interop support to the libgomp plugins (nvptx:
cuda, cuda_driver, hip; gcn: hip, hsa).*
[The idea is that the user can ask OpenMP to return a foreign-runtime
handle (CUdevice, hipCtx_t, …) for to a specified OpenMP device number –
and to create a stream (CUstream, hipStream_t, cudaStream_t,
hsa_queue_t), where OpenMP can take care of dependencies, .e.g, via the
'depend' clause.]
The attached patch comes on top of the interop routine patch,
https://gcc.gnu.org/pipermail/gcc-patches/2024-August/661118.html (and
the associated .texi patch,
https://gcc.gnu.org/pipermail/gcc-patches/2024-August/661072.html ).
The patch is more a WIP/RFC patch than a final patch as it is currently
not wired up: while 'GOMP_interop' can be called manually, the proper
way will be OpenMP's 'interop' directive, currently unimplemented.
Hence, this patch is not extensively tested, does not include testcases,
and target.c's GOMP_interop will surely change to handle all clauses.
But except that target.c's GOMP_interop will change, the rest of the
patch should be be rather solid – and could in principle be applied.
Therefore:
(A) Any comments, suggestions regarding the patch in general and in
particular the plugin/ related parts?
The code all looks pretty reasonable to me.
The header file conditional includes worry me though: it is adding
complexity in a way that hurts maintainability, and looks like it might
break somebody's hypothetical out-of-tree plugin. Is it not better for a
plugin that supports interop to include omp.h itself?
(B) RFC: The *stream* *creation* (hsa_queue_t, cudaStream_t/hipStream_t)
functions have tons of options. Thus:
(i) Does the chosen size/flags argument for the stream/queue generation
for GCN/HIP/CUDA make sense? – Or are other values that are more sensible?
I think we want to follow the principle of least surprise, so max size
queues with the same type we normally use, and certainly no non-blocking.
(ii) Should the user be able to tweak the values?
I mean, the user could say:** 'prefer_type({fr("cuda"),
attr("ompx_priority:-2,ompx_non_blocking")},{fr("hsa"),attr("ompx_queue_size:64"})'.
Do we want to permit this? If yes, which of the values should be
changeable?
Is there any prior art for this? It looks like it could be added in
future, without breaking backward compatibility, so I say "no" (at least
for now).
Tobias
(*) For Nvidia, HIP is just a thin wrapper of defines, typedefs and
inline functions around CUDA. Thus, hip, cuda and cuda_driver are
effectively all the same. / The HSA is a new proposal that is currently
added additional-definition document. (OpenMP spec Issue #4023.)
(**) The used syntax and in particular 'attr' are new in OpenMP 6.0 (new
in TR13). Note that attr only takes string literals [while 'fr' takes
strings and (6.0) identifiers ["omp_ifr_cuda"] or constant integer
expressions (5.1)].