On 22/08/2024 19:26, Tobias Burnus wrote:
This patch adds OpenMP's interop support to the libgomp plugins (nvptx: cuda, cuda_driver, hip; gcn: hip, hsa).*

[The idea is that the user can ask OpenMP to return a foreign-runtime handle (CUdevice, hipCtx_t, …) for to a specified OpenMP device number – and to create a stream (CUstream, hipStream_t, cudaStream_t, hsa_queue_t), where OpenMP can take care of dependencies, .e.g, via the 'depend' clause.]

The attached patch comes on top of the interop routine patch, https://gcc.gnu.org/pipermail/gcc-patches/2024-August/661118.html (and the associated .texi patch, https://gcc.gnu.org/pipermail/gcc-patches/2024-August/661072.html ).

The patch is more a WIP/RFC patch than a final patch as it is currently not wired up: while 'GOMP_interop' can be called manually, the proper way will be OpenMP's 'interop' directive, currently unimplemented. Hence, this patch is not extensively tested, does not include testcases, and target.c's GOMP_interop will surely change to handle all clauses.

But except that target.c's GOMP_interop will change, the rest of the patch should be be rather solid – and could in principle be applied.

Therefore:

(A) Any comments, suggestions regarding the patch in general and in particular the plugin/ related parts?

The code all looks pretty reasonable to me.

The header file conditional includes worry me though: it is adding complexity in a way that hurts maintainability, and looks like it might break somebody's hypothetical out-of-tree plugin. Is it not better for a plugin that supports interop to include omp.h itself?

(B) RFC: The *stream* *creation* (hsa_queue_t, cudaStream_t/hipStream_t) functions have tons of options. Thus:

(i) Does the chosen size/flags argument for the stream/queue generation for GCN/HIP/CUDA make sense? – Or are other values that are more sensible?

I think we want to follow the principle of least surprise, so max size queues with the same type we normally use, and certainly no non-blocking.

(ii) Should the user be able to tweak the values?

I mean, the user could say:** 'prefer_type({fr("cuda"), attr("ompx_priority:-2,ompx_non_blocking")},{fr("hsa"),attr("ompx_queue_size:64"})'.

Do we want to permit this? If yes, which of the values should be changeable?

Is there any prior art for this? It looks like it could be added in future, without breaking backward compatibility, so I say "no" (at least for now).


Tobias

(*) For Nvidia, HIP is just a thin wrapper of defines, typedefs and inline functions around CUDA. Thus, hip, cuda and cuda_driver are effectively all the same. / The HSA is a new proposal that is currently added additional-definition document. (OpenMP spec Issue #4023.)

(**) The used syntax and in particular 'attr' are new in OpenMP 6.0 (new in TR13). Note that attr only takes string literals [while 'fr' takes strings and (6.0) identifiers ["omp_ifr_cuda"] or constant integer expressions (5.1)].

Reply via email to