This patch adds OpenMP's interop support to the libgomp plugins (nvptx: cuda, cuda_driver, hip; gcn: hip, hsa).*

[The idea is that the user can ask OpenMP to return a foreign-runtime handle (CUdevice, hipCtx_t, …) for a specified OpenMP device number – and to create a stream (CUstream, hipStream_t, cudaStream_t, hsa_queue_t), where OpenMP can take care of dependencies, e.g., via the 'depend' clause.]

The attached patch comes on top of the interop routine patch, https://gcc.gnu.org/pipermail/gcc-patches/2024-August/661118.html (and the associated .texi patch, https://gcc.gnu.org/pipermail/gcc-patches/2024-August/661072.html ).

The patch is more a WIP/RFC patch than a final patch as it is currently not wired up: while 'GOMP_interop' can be called manually, the proper way will be OpenMP's 'interop' directive, currently unimplemented. Hence, this patch is not extensively tested, does not include testcases, and target.c's GOMP_interop will surely change to handle all clauses.

But except that target.c's GOMP_interop will change, the rest of the patch should be rather solid – and could in principle be applied.

Therefore:

(A) Any comments or suggestions regarding the patch in general and the plugin/-related parts in particular?

(B) RFC: The *stream* *creation* (hsa_queue_t, cudaStream_t/hipStream_t) functions have tons of options. Thus:

(i) Does the chosen size/flags argument for the stream/queue generation for GCN/HIP/CUDA make sense? – Or are there other values that are more sensible?

(ii) Should the user be able to tweak the values?

I mean, the user could say:** 'prefer_type({fr("cuda"), attr("ompx_priority:-2,ompx_non_blocking")},{fr("hsa"),attr("ompx_queue_size:64")})'.

Do we want to permit this? If yes, which of the values should be changeable?
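
To make (ii) concrete, here is a rough sketch of how a plugin might turn such an attr string into stream/queue settings. Everything in it is an assumption for illustration: the 'ompx_*' keys are only the ones from the example above, and 'struct stream_opts', 'key_is' and 'parse_attr' are hypothetical names that do not exist in the patch.

```c
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical per-stream tunables; not part of the patch.  */
struct stream_opts
{
  int priority;		/* from "ompx_priority:<int>" */
  bool non_blocking;	/* from "ompx_non_blocking" */
  unsigned queue_size;	/* from "ompx_queue_size:<n>"; 0 = plugin default */
};

/* Return true if the LEN characters at S spell exactly KEY.  */
bool
key_is (const char *s, size_t len, const char *key)
{
  return strlen (key) == len && strncmp (s, key, len) == 0;
}

/* Parse one attr string of comma-separated items, each either "key" or
   "key:integer".  Returns false on an unknown key or a malformed value;
   OPTS may be partially updated in that case.  */
bool
parse_attr (const char *attr, struct stream_opts *opts)
{
  while (*attr)
    {
      size_t len = strcspn (attr, ",");	/* Length of this item.  */
      const char *colon = memchr (attr, ':', len);
      size_t keylen = colon ? (size_t) (colon - attr) : len;
      long val = 0;

      if (colon)
	{
	  char *end;
	  val = strtol (colon + 1, &end, 10);
	  if (end == colon + 1 || end != attr + len)
	    return false;	/* Empty value or trailing garbage.  */
	}

      if (!colon && key_is (attr, keylen, "ompx_non_blocking"))
	opts->non_blocking = true;
      else if (colon && key_is (attr, keylen, "ompx_priority"))
	opts->priority = (int) val;
      /* An HSA queue size must be a power of two.  */
      else if (colon && key_is (attr, keylen, "ompx_queue_size")
	       && val > 0 && (val & (val - 1)) == 0)
	opts->queue_size = (unsigned) val;
      else
	return false;

      attr += len;
      if (*attr == ',')
	++attr;
    }
  return true;
}
```

Rejecting unknown keys outright is just one possible policy; silently ignoring them would arguably be friendlier to code that targets several implementations.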

Tobias

(*) For Nvidia, HIP is just a thin wrapper of defines, typedefs and inline functions around CUDA. Thus, hip, cuda and cuda_driver are effectively all the same. / HSA is a new proposal that is currently being added to the additional-definitions document. (OpenMP spec Issue #4023.)

(**) The used syntax and in particular 'attr' are new in OpenMP 6.0 (new in TR13). Note that attr only takes string literals [while 'fr' takes strings and (6.0) identifiers ["omp_ifr_cuda"] or constant integer expressions (5.1)].
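
For readers who skip the diff: per the comments in the patch, the preference list reaches the plugin as one flat character array in which 'fr' and each 'attr' are '\0'-terminated, every preference-specification is closed by one more '\0', and a final '\0' ends the list. The sketch below walks that encoding and returns the first 'fr' a plugin supports. It is only an illustration: 'pick_foreign_runtime' and 'example_prefer_type' are made-up names, and the numeric 'fr' values passed via the parallel int array are left out.

```c
#include <stddef.h>
#include <string.h>

/* The prefer_type example from above in the flat encoding; the string
   literal's implicit terminator supplies the list-ending '\0'.  */
const char example_prefer_type[]
  = "cuda\0ompx_priority:-2,ompx_non_blocking\0\0hsa\0ompx_queue_size:64\0\0";

/* Return the first 'fr' name in LIST that also occurs in SUPPORTED
   (a NULL-terminated array), or FALLBACK if LIST is NULL or nothing
   matches.  */
const char *
pick_foreign_runtime (const char *list, const char *const *supported,
		      const char *fallback)
{
  if (list == NULL)
    return fallback;

  while (*list)
    {
      for (size_t i = 0; supported[i] != NULL; i++)
	if (strcmp (list, supported[i]) == 0)
	  return supported[i];

      list += strlen (list) + 1;	/* Skip the 'fr' string.  */
      while (*list)			/* Skip the optional attributes.  */
	list += strlen (list) + 1;
      ++list;				/* Skip the closing '\0'.  */
    }
  return fallback;
}
```

With a supported set of {"hip", "hsa"}, as for gcn, the example list yields "hsa"; with {"cuda", "cuda_driver", "hip"}, as for nvptx, it yields "cuda".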

libgomp: Add OpenMP interop support to nvptx + gcn plugin

FIXME/NOTE: target.c's GOMP_interop is a stub, sufficient for some initial
testing, but not sufficient to implement 'omp interop'. However, the
plugin side should be feature complete, except for possible extensions.

This adds interop support to the libgomp plugins; to the gcn one, it adds
HSA and HIP and, to the nvptx one, it adds CUDA, CUDA_DRIVER and HIP.

libgomp/ChangeLog:

	* libgomp-plugin.h: Include 'omp.h.in' if _LIBGOMP_PLUGIN_INCLUDE
	is set; define the following only if _LIBGOMP_OMP_LOCK_DEFINED is
	set (either via libgomp.h or when _LIBGOMP_PLUGIN_INCLUDE is set).
	(struct interop_obj_t): New.
	(GOMP_OFFLOAD_interop, GOMP_OFFLOAD_get_interop_int,
	GOMP_OFFLOAD_get_interop_ptr, GOMP_OFFLOAD_get_interop_str,
	GOMP_OFFLOAD_get_interop_type_desc): Add prototype.
	* libgomp.h: Move 'omp.h.in' inclusion to the top. 
	(struct gomp_device_descr): Add function pointers for interop.
	* libgomp.map (GOMP_5.1.3): Add GOMP_interop.
	* libgomp_g.h (GOMP_interop): Add prototype.
	* target.c (GOMP_interop): New.
	(omp_get_interop_int, omp_get_interop_ptr, omp_get_interop_str,
	omp_get_interop_type_desc): Add calls into the plugin.
	(gomp_load_plugin_for_device): DLSYM_OPT the new plugin functions.
	* plugin/plugin-gcn.c: Define _LIBGOMP_PLUGIN_INCLUDE before
	including libgomp-plugin.h.
	(hipError_t, hipCtx_t, hipStream_t): Add stub typedefs.
	(struct hip_runtime_fn_info): New.
	(struct agent_info): Add hsa_device_num.
	(hip_fns, hip_runtime_lib): New global vars.
	(init_environment_variables): Init hip_runtime_lib.
	(struct agent_id_data_t): New.
	(assign_agent_ids): Use it to set hsa_device_num.
	(init_hsa_context): Update call.
	(init_hip_runtime_functions, GOMP_OFFLOAD_interop,
	GOMP_OFFLOAD_get_interop_int, GOMP_OFFLOAD_get_interop_ptr,
	GOMP_OFFLOAD_get_interop_str, GOMP_OFFLOAD_get_interop_type_desc): New.
	* plugin/plugin-nvptx.c: Define _LIBGOMP_PLUGIN_INCLUDE before
	including libgomp-plugin.h.
	(GOMP_OFFLOAD_interop, GOMP_OFFLOAD_get_interop_int,
	GOMP_OFFLOAD_get_interop_ptr, GOMP_OFFLOAD_get_interop_str,
	GOMP_OFFLOAD_get_interop_type_desc): New.

 libgomp/libgomp-plugin.h      |  37 ++++
 libgomp/libgomp.h             |  17 +-
 libgomp/libgomp.map           |   1 +
 libgomp/libgomp_g.h           |   2 +
 libgomp/plugin/plugin-gcn.c   | 415 +++++++++++++++++++++++++++++++++++++++++-
 libgomp/plugin/plugin-nvptx.c | 282 ++++++++++++++++++++++++++++
 libgomp/target.c              | 134 +++++++++++---
 7 files changed, 848 insertions(+), 40 deletions(-)

diff --git a/libgomp/libgomp-plugin.h b/libgomp/libgomp-plugin.h
index 0c9c28c65cf..ce1a83bc51e 100644
--- a/libgomp/libgomp-plugin.h
+++ b/libgomp/libgomp-plugin.h
@@ -33,6 +33,14 @@
 #include <stddef.h>
 #include <stdint.h>
 
+#ifdef _LIBGOMP_PLUGIN_INCLUDE
+  /* Include 'omp.h' for the interop definitions.  */
+  #define _LIBGOMP_OMP_LOCK_DEFINED 1
+  typedef struct omp_lock_t omp_lock_t;
+  typedef struct omp_nest_lock_t omp_nest_lock_t;
+  #include "omp.h.in"
+#endif
+
 #ifdef __cplusplus
 extern "C" {
 #endif
@@ -101,6 +109,18 @@ struct addr_pair
   uintptr_t end;
 };
 
+
+#ifdef _LIBGOMP_OMP_LOCK_DEFINED
+/* Only define when omp.h.in was included, as in plugin/ and in libgomp.h.   */
+struct interop_obj_t
+{
+  void *stream;
+  void *device_data;
+  omp_interop_fr_t fr;
+  int device_num;
+};
+#endif
+
 /* This following symbol is used to name the target side variable struct that
    holds the designated ICVs of the target device. The symbol needs to be
    available to libgomp code and the offload plugin (which in the latter case
@@ -180,6 +200,23 @@ extern int GOMP_OFFLOAD_openacc_cuda_set_stream (struct goacc_asyncqueue *,
 extern union goacc_property_value
   GOMP_OFFLOAD_openacc_get_property (int, enum goacc_property);
 
+#ifdef _LIBGOMP_OMP_LOCK_DEFINED
+/* Only define when omp.h.in was included, as in plugin/ and in libgomp.h.   */
+extern void GOMP_OFFLOAD_interop (struct interop_obj_t *, int,
+				  bool, bool, const char *, const int *);
+extern intptr_t GOMP_OFFLOAD_get_interop_int (struct interop_obj_t *,
+					      omp_interop_property_t,
+					      omp_interop_rc_t *);
+extern void *GOMP_OFFLOAD_get_interop_ptr (struct interop_obj_t *,
+					   omp_interop_property_t,
+					   omp_interop_rc_t *);
+extern const char *GOMP_OFFLOAD_get_interop_str (struct interop_obj_t *obj,
+						 omp_interop_property_t,
+						 omp_interop_rc_t *);
+extern const char *GOMP_OFFLOAD_get_interop_type_desc (struct interop_obj_t *,
+						       omp_interop_property_t);
+#endif
+
 #ifdef __cplusplus
 }
 #endif
diff --git a/libgomp/libgomp.h b/libgomp/libgomp.h
index 089393846d1..78a011e98d4 100644
--- a/libgomp/libgomp.h
+++ b/libgomp/libgomp.h
@@ -43,7 +43,14 @@
 
 #include "config.h"
 #include <stdint.h>
+
+/* Include omp.h by parts.  */
+#include "omp-lock.h"
+#define _LIBGOMP_OMP_LOCK_DEFINED 1
+#include "omp.h.in"
+
 #include "libgomp-plugin.h"
+
 #include "gomp-constants.h"
 
 #ifdef HAVE_PTHREAD_H
@@ -1417,6 +1424,11 @@ struct gomp_device_descr
   __typeof (GOMP_OFFLOAD_can_run) *can_run_func;
   __typeof (GOMP_OFFLOAD_run) *run_func;
   __typeof (GOMP_OFFLOAD_async_run) *async_run_func;
+  __typeof (GOMP_OFFLOAD_interop) *get_interop_func;
+  __typeof (GOMP_OFFLOAD_get_interop_int) *get_interop_int_func;
+  __typeof (GOMP_OFFLOAD_get_interop_ptr) *get_interop_ptr_func;
+  __typeof (GOMP_OFFLOAD_get_interop_str) *get_interop_str_func;
+  __typeof (GOMP_OFFLOAD_get_interop_type_desc) *get_interop_type_desc_func;
 
   /* Splay tree containing information about mapped memory regions.  */
   struct splay_tree_s mem_map;
@@ -1499,11 +1511,6 @@ gomp_work_share_init_done (void)
 /* Now that we're back to default visibility, include the globals.  */
 #include "libgomp_g.h"
 
-/* Include omp.h by parts.  */
-#include "omp-lock.h"
-#define _LIBGOMP_OMP_LOCK_DEFINED 1
-#include "omp.h.in"
-
 #if !defined (HAVE_ATTRIBUTE_VISIBILITY) \
     || !defined (HAVE_ATTRIBUTE_ALIAS) \
     || !defined (HAVE_AS_SYMVER_DIRECTIVE) \
diff --git a/libgomp/libgomp.map b/libgomp/libgomp.map
index 7c2345eb29b..3dce1ebc72f 100644
--- a/libgomp/libgomp.map
+++ b/libgomp/libgomp.map
@@ -430,6 +430,7 @@ GOMP_5.1.2 {
 
 GOMP_5.1.3 {
   global:
+	GOMP_interop;
 	omp_get_num_interop_properties;
 	omp_get_interop_int;
 	omp_get_interop_ptr;
diff --git a/libgomp/libgomp_g.h b/libgomp/libgomp_g.h
index c0cc03ae61f..65e51aa2f0d 100644
--- a/libgomp/libgomp_g.h
+++ b/libgomp/libgomp_g.h
@@ -358,6 +358,8 @@ extern void GOMP_target_enter_exit_data (int, size_t, void **, size_t *,
 extern void GOMP_teams (unsigned int, unsigned int);
 extern bool GOMP_teams4 (unsigned int, unsigned int, unsigned int, bool);
 extern void *GOMP_target_map_indirect_ptr (void *);
+extern void GOMP_interop (void *, int, bool, bool, const char *,
+			  const int *);
 
 /* teams.c */
 
diff --git a/libgomp/plugin/plugin-gcn.c b/libgomp/plugin/plugin-gcn.c
index 3d882b5ab63..c18de6cc51c 100644
--- a/libgomp/plugin/plugin-gcn.c
+++ b/libgomp/plugin/plugin-gcn.c
@@ -41,7 +41,9 @@
 #include <hsa_ext_amd.h>
 #include <dlfcn.h>
 #include <signal.h>
+#define _LIBGOMP_PLUGIN_INCLUDE 1
 #include "libgomp-plugin.h"
+#undef _LIBGOMP_PLUGIN_INCLUDE
 #include "config/gcn/libgomp-gcn.h"  /* For struct output.  */
 #include "gomp-constants.h"
 #include <elf.h>
@@ -208,6 +210,24 @@ struct hsa_runtime_fn_info
      const hsa_signal_t *dep_signals, hsa_signal_t completion_signal);
 };
 
+/* As the HIP runtime is dlopened, the following structure defines function
+   pointers utilized by the interop feature of this plugin.
+   Add sufficient type declarations to get this to work.  */
+
+typedef int hipError_t;  /* Actually an enum; 0 == success. */
+typedef void* hipCtx_t;
+struct hipStream_s;
+typedef struct hipStream_s* hipStream_t;
+
+struct hip_runtime_fn_info
+{
+  hipError_t (*hipStreamCreate_fn) (hipStream_t *);
+  hipError_t (*hipStreamDestroy_fn) (hipStream_t *);
+  hipError_t (*hipCtxGetCurrent_fn) (hipCtx_t *ctx);
+  hipError_t (*hipSetDevice_fn) (int deviceId);
+  hipError_t (*hipGetDevice_fn) (int *deviceId);
+};
+
 /* Structure describing the run-time and grid properties of an HSA kernel
    lauch.  This needs to match the format passed to GOMP_OFFLOAD_run.  */
 
@@ -407,8 +427,10 @@ struct agent_info
 {
   /* The HSA ID of the agent.  Assigned when hsa_context is initialized.  */
   hsa_agent_t id;
-  /* The user-visible device number.  */
+  /* The user-visible device number; only includes GPU devices.  */
   int device_id;
+  /* The HSA device number; also includes the CPUs. */
+  int hsa_device_num;
   /* Whether the agent has been initialized.  The fields below are usable only
      if it has been.  */
   bool initialized;
@@ -552,9 +574,11 @@ struct hsa_context_info
 static struct hsa_context_info hsa_context;
 
 /* HSA runtime functions that are initialized in init_hsa_context.  */
-
 static struct hsa_runtime_fn_info hsa_fns;
 
+/* HIP runtime functions that are initialized in init_hip_runtime_functions.  */
+static struct hip_runtime_fn_info hip_fns;
+
 /* Heap space, allocated target-side, provided for use of newlib malloc.
    Each module should have it's own heap allocated.
    Beware that heap usage increases with OpenMP teams.  See also arenas.  */
@@ -577,10 +601,11 @@ static bool debug;
 
 static bool suppress_host_fallback;
 
-/* Flag to locate HSA runtime shared library that is dlopened
+/* Flags to locate the HSA and HIP runtime shared libraries that are dlopened
    by this plug-in.  */
 
 static const char *hsa_runtime_lib;
+static const char *hip_runtime_lib;
 
 /* Flag to decide if the runtime should support also CPU devices (can be
    a simulator).  */
@@ -1067,6 +1092,10 @@ init_environment_variables (void)
   if (hsa_runtime_lib == NULL)
     hsa_runtime_lib = "libhsa-runtime64.so.1";
 
+  hip_runtime_lib = secure_getenv ("HIP_RUNTIME_LIB");
+  if (hip_runtime_lib == NULL)
+    hip_runtime_lib = "libamdhip64.so";
+
   support_cpu_devices = secure_getenv ("GCN_SUPPORT_CPU_DEVICES");
 
   const char *x = secure_getenv ("GCN_NUM_TEAMS");
@@ -1497,6 +1526,12 @@ count_gpu_agents (hsa_agent_t agent, void *data __attribute__ ((unused)))
   return HSA_STATUS_SUCCESS;
 }
 
+struct agent_id_data_t
+{
+  int agent_index;
+  int hsa_device_num;
+};
+
 /* Callback of hsa_iterate_agents; if AGENT is a GPU device, assign the agent
    id to the describing structure in the hsa context.  The index of the
    structure is pointed to by DATA, increment it afterwards.  */
@@ -1504,11 +1539,13 @@ count_gpu_agents (hsa_agent_t agent, void *data __attribute__ ((unused)))
 static hsa_status_t
 assign_agent_ids (hsa_agent_t agent, void *data)
 {
+  struct agent_id_data_t *d = (struct agent_id_data_t *) data;
+  ++d->hsa_device_num;
   if (suitable_hsa_agent_p (agent))
     {
-      int *agent_index = (int *) data;
-      hsa_context.agents[*agent_index].id = agent;
-      ++*agent_index;
+      hsa_context.agents[d->agent_index].id = agent;
+      hsa_context.agents[d->agent_index].hsa_device_num = d->hsa_device_num;
+      ++d->agent_index;
     }
   return HSA_STATUS_SUCCESS;
 }
@@ -1522,7 +1559,7 @@ static bool
 init_hsa_context (bool probe)
 {
   hsa_status_t status;
-  int agent_index = 0;
+  struct agent_id_data_t agent_id_data = {};
 
   if (hsa_context.initialized)
     return true;
@@ -1552,10 +1589,10 @@ init_hsa_context (bool probe)
   hsa_context.agents
     = GOMP_PLUGIN_malloc_cleared (hsa_context.agent_count
 				  * sizeof (struct agent_info));
-  status = hsa_fns.hsa_iterate_agents_fn (assign_agent_ids, &agent_index);
+  status = hsa_fns.hsa_iterate_agents_fn (assign_agent_ids, &agent_id_data);
   if (status != HSA_STATUS_SUCCESS)
     return hsa_error ("Scanning compute agents failed", status);
-  if (agent_index != hsa_context.agent_count)
+  if (agent_id_data.agent_index != hsa_context.agent_count)
     {
       GOMP_PLUGIN_error ("Failed to assign IDs to all GCN agents");
       return false;
@@ -4354,6 +4391,366 @@ unlock:
   return retval;
 }
 
+
+static bool
+init_hip_runtime_functions (void)
+{
+  if (hip_fns.hipStreamCreate_fn)
+    return true;
+
+  void *handle = dlopen (hip_runtime_lib, RTLD_LAZY);
+  if (handle == NULL)
+    return false;
+
+#define DLSYM_OPT_FN(function) \
+  hip_fns.function##_fn = dlsym (handle, #function)
+
+  DLSYM_OPT_FN (hipStreamCreate);
+  DLSYM_OPT_FN (hipStreamDestroy);
+  DLSYM_OPT_FN (hipCtxGetCurrent);
+  DLSYM_OPT_FN (hipGetDevice);
+  DLSYM_OPT_FN (hipSetDevice);
+#undef DLSYM_OPT_FN
+
+  if (!hip_fns.hipStreamCreate_fn
+      || !hip_fns.hipStreamDestroy_fn
+      || !hip_fns.hipCtxGetCurrent_fn
+      || !hip_fns.hipGetDevice_fn
+      || !hip_fns.hipSetDevice_fn)
+    {
+      hip_fns.hipStreamCreate_fn = NULL;
+      return false;
+    }
+
+  return true;
+}
+
+
+void
+GOMP_OFFLOAD_interop (struct interop_obj_t *obj, int ord, bool targetsync,
+		      bool destroy, const char *prefer_type,
+		      const int *prefer_type_int)
+{
+  if (destroy)
+    {
+      if (obj->stream && obj->fr == omp_ifr_hsa)
+	{
+	  hsa_status_t status
+	    = hsa_fns.hsa_queue_destroy_fn ((hsa_queue_t *) obj->stream);
+	  if (status != HSA_STATUS_SUCCESS)
+	    hsa_fatal ("Error destroying interop hsa_queue_t", status);
+	  obj->stream = NULL;  /* Released by hsa_queue_destroy_fn.  */
+	}
+      else if (obj->stream)
+	{
+	  hipError_t err
+	    = hip_fns.hipStreamDestroy_fn ((hipStream_t *) &obj->stream);
+	  if (err != 0)
+	    GOMP_PLUGIN_fatal ("Error destroying interop hipStream_t: %d", err);
+	  obj->stream = NULL;  /* Released by hipStreamDestroy_fn.  */
+	}
+      return;
+    }
+
+  bool have_hip = init_hip_runtime_functions ();
+  obj->fr = have_hip ? omp_ifr_hip : omp_ifr_hsa;
+  int i = 0;
+
+  /* The 'fr' (first item) and 'attr' are separated by '\0'; each
+     preference-specification ends with '\0\0'.  The end of the list
+     is reached if another trailing '\0' follows.  */
+  if (prefer_type)
+    /* Outer loop over the foreign runtime string ('fr'); for numerical
+       values to be used instead, prefer_type_int != NULL and
+       prefer_type_int[i] != 0.  The 'fr' string is ' ' in that case.
+       A preference-specification without a 'fr' has a ' ' for 'fr',
+       but prefer_type_int[i], if present, is 0.   */
+    while (*prefer_type)
+      {
+	if (prefer_type_int && prefer_type_int[i] != 0)
+	  {
+	    if (prefer_type_int[i] == omp_ifr_hip && have_hip)
+	      break;
+	    if (prefer_type_int[i] == omp_ifr_hsa)
+	      {
+		obj->fr = omp_ifr_hsa;
+		break;
+	      }
+	  }
+	else if (!strcmp (prefer_type, "hip"))
+	  break;
+	else if (!strcmp (prefer_type, "hsa"))
+	  {
+	    obj->fr = omp_ifr_hsa;
+	    break;
+	  }
+	prefer_type += 1 + strlen (prefer_type);
+
+	/* Loop over the optional attributes. */
+	while (*prefer_type)
+	  prefer_type += 1 + strlen (prefer_type);
+
+	++prefer_type;
+	++i;
+      }
+
+  _Static_assert (sizeof (uint64_t) == sizeof (hsa_agent_t),
+		  "sizeof (uint64_t) == sizeof (hsa_agent_t)");
+  struct agent_info *agent = get_agent_info (ord);
+  obj->device_data = agent;
+
+  if (targetsync && obj->fr == omp_ifr_hsa)
+    {
+/* RFC: Support HSA_QUEUE_TYPE_MULTI, HSA_QUEUE_TYPE_SINGLE, HSA_QUEUE_TYPE_COOPERATIVE ? */
+/* RFC: Support size? Must be power of 2 between 1 and the value of HSA_AGENT_INFO_QUEUE_MAX_SIZE.  */
+      obj->stream = NULL;  /* Set by hsa_queue_create_fn below.  */
+
+      hsa_status_t status;
+      int32_t queue_size;
+      status = hsa_fns.hsa_agent_get_info_fn (agent->id,
+					      HSA_AGENT_INFO_QUEUE_MAX_SIZE,
+					      &queue_size);
+      if (status != HSA_STATUS_SUCCESS)
+	hsa_fatal ("Error obtaining HSA_AGENT_INFO_QUEUE_MAX_SIZE for interop "
+		   "hsa_queue_t", status);
+      status = hsa_fns.hsa_queue_create_fn (agent->id, queue_size,
+					    HSA_QUEUE_TYPE_MULTI,
+					    NULL, NULL, UINT32_MAX, UINT32_MAX,
+					    (hsa_queue_t **) &obj->stream);
+      if (status != HSA_STATUS_SUCCESS)
+	hsa_fatal ("Error creating interop hsa_queue_t", status);
+    }
+  else if (targetsync)
+    {
+/* RFC: Support priority and flags (hipStreamDefault, hipStreamNonBlocking)? */
+      hipError_t err
+	= hip_fns.hipStreamCreate_fn ((hipStream_t *) &obj->stream);
+      if (err != 0)
+	GOMP_PLUGIN_fatal ("Error creating interop hipStream_t: %d", err);
+    }
+}
+
+intptr_t
+GOMP_OFFLOAD_get_interop_int (struct interop_obj_t *obj,
+			      omp_interop_property_t property_id,
+			      omp_interop_rc_t *ret_code)
+{
+  if (obj->fr != omp_ifr_hip && obj->fr != omp_ifr_hsa)
+    {
+      *ret_code = omp_irc_no_value;  /* Hmm. */
+      return 0;
+    }
+  switch (property_id)
+    {
+    case omp_ipr_fr_id:
+      *ret_code = omp_irc_success;
+      return obj->fr;
+    case omp_ipr_fr_name:
+    case omp_ipr_vendor_name:
+      /* Both are string properties.  */
+      *ret_code = omp_irc_type_str;
+      return 0;
+    case omp_ipr_vendor:
+      *ret_code = omp_irc_success;
+      return 1; /* amd */
+    case omp_ipr_device_num:
+      *ret_code = omp_irc_success;
+      return obj->device_num;
+    case omp_ipr_platform:
+      *ret_code = omp_irc_no_value;
+      return 0;
+    case omp_ipr_device:
+      if (obj->fr == omp_ifr_hsa)
+	{
+	  *ret_code = omp_irc_type_ptr;
+	  return 0;
+	}
+      *ret_code = omp_irc_success;
+      return ((struct agent_info *) obj->device_data)->hsa_device_num;
+    case omp_ipr_device_context:
+      if (obj->fr == omp_ifr_hsa)
+	*ret_code = omp_irc_no_value;
+      else
+	*ret_code = omp_irc_type_ptr;
+      return 0;
+    case omp_ipr_targetsync:
+      if (!obj->stream)
+	*ret_code = omp_irc_no_value;
+      else
+	*ret_code = omp_irc_type_ptr;
+      return 0;
+    default:
+      break;
+    }
+  __builtin_unreachable ();
+  return 0;
+}
+
+void *
+GOMP_OFFLOAD_get_interop_ptr (struct interop_obj_t *obj,
+			      omp_interop_property_t property_id,
+			      omp_interop_rc_t *ret_code)
+{
+  if (obj->fr != omp_ifr_hip && obj->fr != omp_ifr_hsa)
+    {
+      *ret_code = omp_irc_no_value;  /* Hmm. */
+      return 0;
+    }
+  switch (property_id)
+    {
+    case omp_ipr_fr_id:
+      *ret_code = omp_irc_type_int;
+      return NULL;
+    case omp_ipr_fr_name:
+      *ret_code = omp_irc_type_str;
+      return NULL;
+    case omp_ipr_vendor:
+      *ret_code = omp_irc_type_int;
+      return NULL;
+    case omp_ipr_vendor_name:
+      *ret_code = omp_irc_type_str;
+      return NULL;
+    case omp_ipr_device_num:
+      *ret_code = omp_irc_type_int;
+      return NULL;
+    case omp_ipr_platform:
+      *ret_code = omp_irc_no_value;
+      return NULL;
+    case omp_ipr_device:
+      if (obj->fr == omp_ifr_hsa)
+	{
+	  *ret_code = omp_irc_success;
+	  /* hsa_agent_t is a struct containing a single uint64_t. */
+	  return &((struct agent_info *) obj->device_data)->id;
+	}
+      else
+	{
+	  *ret_code = omp_irc_type_int;
+	  return NULL;
+	}
+    case omp_ipr_device_context:
+      if (obj->fr == omp_ifr_hsa)
+	{
+	  *ret_code = omp_irc_no_value;
+	  return NULL;
+	}
+      else
+        {
+	  hipCtx_t ctx;
+	  int dev_curr;
+	  int dev = ((struct agent_info *) obj->device_data)->hsa_device_num;
+	  hipError_t err;
+	  err = hip_fns.hipGetDevice_fn (&dev_curr);
+	  if (!err)
+	    err = hip_fns.hipSetDevice_fn (dev);
+	  if (!err)
+	    err = hip_fns.hipCtxGetCurrent_fn (&ctx);
+	  if (!err)
+	    err = hip_fns.hipSetDevice_fn (dev_curr);
+	  if (err)
+	    GOMP_PLUGIN_fatal ("Error obtaining hipCtx_t: %d", err);
+	  *ret_code = omp_irc_success;
+	  return ctx;
+	}
+    case omp_ipr_targetsync:
+      if (!obj->stream)
+	{
+	  *ret_code = omp_irc_no_value;
+	  return NULL;
+	}
+      *ret_code = omp_irc_success;
+      return obj->stream;
+    default:
+      break;
+    }
+  __builtin_unreachable ();
+  return NULL;
+}
+
+const char *
+GOMP_OFFLOAD_get_interop_str (struct interop_obj_t *obj,
+			      omp_interop_property_t property_id,
+			      omp_interop_rc_t *ret_code)
+{
+  /* Only HIP and HSA are supported by this plugin.  */
+  if (obj->fr != omp_ifr_hip
+      && obj->fr != omp_ifr_hsa)
+    {
+      *ret_code = omp_irc_no_value;  /* Hmm. */
+      return 0;
+    }
+  switch (property_id)
+    {
+    case omp_ipr_fr_id:
+      *ret_code = omp_irc_type_int;
+      return NULL;
+    case omp_ipr_fr_name:
+      *ret_code = omp_irc_success;
+      if (obj->fr == omp_ifr_hip)
+	return "hip";
+      if (obj->fr == omp_ifr_hsa)
+	return "hsa";
+      break;
+    case omp_ipr_vendor:
+      *ret_code = omp_irc_type_int;
+      return NULL;
+    case omp_ipr_vendor_name:
+      *ret_code = omp_irc_success;
+      return "amd";
+    case omp_ipr_device_num:
+      *ret_code = omp_irc_type_int;
+      return NULL;
+    case omp_ipr_platform:
+      *ret_code = omp_irc_no_value;
+      return NULL;
+    case omp_ipr_device:
+      if (obj->fr == omp_ifr_hsa)
+	*ret_code = omp_irc_type_ptr;
+      else
+	*ret_code = omp_irc_type_int;
+      return NULL;
+    case omp_ipr_device_context:
+      if (obj->fr == omp_ifr_hsa)
+	*ret_code = omp_irc_no_value;
+      else
+	*ret_code = omp_irc_type_ptr;
+      return NULL;
+    case omp_ipr_targetsync:
+      if (!obj->stream)
+	*ret_code = omp_irc_no_value;
+      else
+	*ret_code = omp_irc_type_ptr;
+      return NULL;
+    default:
+      break;
+    }
+  __builtin_unreachable ();
+  return 0;
+}
+
+const char *
+GOMP_OFFLOAD_get_interop_type_desc (struct interop_obj_t *obj,
+				    omp_interop_property_t property_id)
+{
+  _Static_assert (omp_ipr_targetsync == omp_ipr_first,
+		  "omp_ipr_targetsync == omp_ipr_first");
+  _Static_assert (omp_ipr_platform - omp_ipr_first + 1 == 4,
+                  "omp_ipr_platform - omp_ipr_first + 1 == 4");
+  static const char *desc_hip[] = {"N/A",		/* platform */
+				   "hipDevice_t",	/* device */
+				   "hipCtx_t",		/* device_context */
+				   "hipStream_t"};	/* targetsync */
+  static const char *desc_hsa[] = {"N/A",		/* platform */
+				   "hsa_agent_t *",	/* device */
+				   "N/A",		/* device_context */
+				   "hsa_queue_t *"};	/* targetsync */
+  if (obj->fr == omp_ifr_hip)
+    return desc_hip[omp_ipr_platform - property_id];
+  if (obj->fr == omp_ifr_hsa)
+    return desc_hsa[omp_ipr_platform - property_id];
+  return NULL;
+}
+
 /* }}}  */
 /* {{{ OpenMP Plugin API  */
 
diff --git a/libgomp/plugin/plugin-nvptx.c b/libgomp/plugin/plugin-nvptx.c
index 99cbcb699b3..27e00c7fb0f 100644
--- a/libgomp/plugin/plugin-nvptx.c
+++ b/libgomp/plugin/plugin-nvptx.c
@@ -35,7 +35,9 @@
 #include "openacc.h"
 #include "config.h"
 #include "symcat.h"
+#define _LIBGOMP_PLUGIN_INCLUDE 1
 #include "libgomp-plugin.h"
+#undef _LIBGOMP_PLUGIN_INCLUDE
 #include "oacc-plugin.h"
 #include "gomp-constants.h"
 #include "oacc-int.h"
@@ -2384,6 +2386,286 @@ nvptx_stacks_acquire (struct ptx_device *ptx_dev, size_t size, int num)
   return (void *) ptx_dev->omp_stacks.ptr;
 }
 
+void
+GOMP_OFFLOAD_interop (struct interop_obj_t *obj, int ord, bool targetsync,
+		      bool destroy, const char *prefer_type,
+		      const int *prefer_type_int)
+{
+  obj->fr = omp_ifr_cuda;
+  int i = 0;
+
+  if (destroy)
+    {
+      if (targetsync)
+ 	CUDA_CALL_ASSERT (cuStreamDestroy, obj->stream);
+      return;
+    }
+
+  /* The 'fr' (first item) and 'attr' are separated by '\0'; each
+     preference-specification ends with '\0\0'.  The end of the list
+     is reached if another trailing '\0' follows.  */
+  if (prefer_type)
+    /* Outer loop over the foreign runtime string ('fr'); for numerical
+       values to be used instead, prefer_type_int != NULL and
+       prefer_type_int[i] != 0.  The 'fr' string is ' ' in that case.
+       A preference-specification without a 'fr' has a ' ' for 'fr',
+       but prefer_type_int[i], if present, is 0.   */
+    while (*prefer_type)
+      {
+	if (prefer_type_int && prefer_type_int[i] != 0)
+	  {
+	    if (prefer_type_int[i] == omp_ifr_cuda)
+	      break;
+	    if (prefer_type_int[i] == omp_ifr_cuda_driver)
+	      {
+		obj->fr = omp_ifr_cuda_driver;
+		break;
+	      }
+	    if (prefer_type_int[i] == omp_ifr_hip)
+	      {
+		obj->fr = omp_ifr_hip;
+		break;
+	      }
+	  }
+	else if (!strcmp (prefer_type, "cuda"))
+	  break;
+	if (!strcmp (prefer_type, "cuda_driver"))
+	  {
+	    obj->fr = omp_ifr_cuda_driver;
+	    break;
+	  }
+	if (!strcmp (prefer_type, "hip"))
+	  {
+	    obj->fr = omp_ifr_hip;
+	    break;
+	  }
+	prefer_type += 1 + strlen (prefer_type);
+
+	/* Loop over the optional attributes. */
+	while (*prefer_type)
+	  prefer_type += 1 + strlen (prefer_type);
+
+	++prefer_type;
+	++i;
+      }
+
+  obj->device_data = ptx_devices[ord];
+
+  if (targetsync)
+    {
+/* RFC: CU_STREAM_DEFAULT vs. CU_STREAM_NON_BLOCKING. */
+/* RFC: Optionally support cuStreamCreateWithPriority with int priority. */
+      CUstream stream = NULL;
+      CUDA_CALL_ASSERT (cuStreamCreate, &stream, CU_STREAM_DEFAULT);
+      obj->stream = stream;
+    }
+}
+
+
+intptr_t
+GOMP_OFFLOAD_get_interop_int (struct interop_obj_t *obj,
+			      omp_interop_property_t property_id,
+			      omp_interop_rc_t *ret_code)
+{
+  if (obj->fr != omp_ifr_cuda
+      && obj->fr != omp_ifr_cuda_driver
+      && obj->fr != omp_ifr_hip)
+    {
+      *ret_code = omp_irc_no_value;  /* Hmm. */
+      return 0;
+    }
+  switch (property_id)
+    {
+    case omp_ipr_fr_id:
+      *ret_code = omp_irc_success;
+      return obj->fr;
+    case omp_ipr_fr_name:
+      *ret_code = omp_irc_type_str;
+      return 0;
+    case omp_ipr_vendor:
+      *ret_code = omp_irc_success;
+      return 11; /* nvidia */
+    case omp_ipr_vendor_name:
+      *ret_code = omp_irc_type_str;
+      return 0;
+    case omp_ipr_device_num:
+      *ret_code = omp_irc_success;
+      return obj->device_num;
+    case omp_ipr_platform:
+      *ret_code = omp_irc_no_value;
+      return 0;
+    case omp_ipr_device:
+      *ret_code = omp_irc_success;
+      return ((struct ptx_device *) obj->device_data)->dev;
+    case omp_ipr_device_context:
+      if (obj->fr == omp_ifr_cuda)
+	*ret_code = omp_irc_no_value;
+      else
+	*ret_code = omp_irc_type_ptr;
+      return 0;
+    case omp_ipr_targetsync:
+      if (!obj->stream)
+	{
+	  *ret_code = omp_irc_no_value;
+	  return 0;
+	}
+      /* ptr fits into (u)intptr_t */
+      *ret_code = omp_irc_success;
+      return (uintptr_t) obj->stream;
+    default:
+      break;
+    }
+  __builtin_unreachable ();
+  return 0;
+}
+
+void *
+GOMP_OFFLOAD_get_interop_ptr (struct interop_obj_t *obj,
+			      omp_interop_property_t property_id,
+			      omp_interop_rc_t *ret_code)
+{
+  if (obj->fr != omp_ifr_cuda
+      && obj->fr != omp_ifr_cuda_driver
+      && obj->fr != omp_ifr_hip)
+    {
+      *ret_code = omp_irc_no_value;  /* Hmm. */
+      return 0;
+    }
+  switch (property_id)
+    {
+    case omp_ipr_fr_id:
+      *ret_code = omp_irc_type_int;
+      return NULL;
+    case omp_ipr_fr_name:
+      *ret_code = omp_irc_type_str;
+      return NULL;
+    case omp_ipr_vendor:
+      *ret_code = omp_irc_type_int;
+      return NULL;
+    case omp_ipr_vendor_name:
+      *ret_code = omp_irc_type_str;
+      return NULL;
+    case omp_ipr_device_num:
+      *ret_code = omp_irc_type_int;
+      return NULL;
+    case omp_ipr_platform:
+      *ret_code = omp_irc_no_value;
+      return NULL;
+    case omp_ipr_device:
+      *ret_code = omp_irc_type_int;
+      return NULL;
+    case omp_ipr_device_context:
+      if (obj->fr == omp_ifr_cuda)
+	{
+	  *ret_code = omp_irc_no_value;
+	  return NULL;
+	}
+      *ret_code = omp_irc_success;
+      return ((struct ptx_device *) obj->device_data)->ctx;
+    case omp_ipr_targetsync:
+      if (!obj->stream)
+	{
+	  *ret_code = omp_irc_no_value;
+	  return NULL;
+	}
+      *ret_code = omp_irc_success;
+      return obj->stream;
+    default:
+      break;
+    }
+  __builtin_unreachable ();
+  return NULL;
+}
+
+const char *
+GOMP_OFFLOAD_get_interop_str (struct interop_obj_t *obj,
+			      omp_interop_property_t property_id,
+			      omp_interop_rc_t *ret_code)
+{
+  if (obj->fr != omp_ifr_cuda
+      && obj->fr != omp_ifr_cuda_driver
+      && obj->fr != omp_ifr_hip)
+    {
+      *ret_code = omp_irc_no_value;  /* Hmm. */
+      return 0;
+    }
+  switch (property_id)
+    {
+    case omp_ipr_fr_id:
+      *ret_code = omp_irc_type_int;
+      return NULL;
+    case omp_ipr_fr_name:
+      *ret_code = omp_irc_success;
+      if (obj->fr == omp_ifr_cuda)
+	return "cuda";
+      if (obj->fr == omp_ifr_cuda_driver)
+	return "cuda_driver";
+      if (obj->fr == omp_ifr_hip)
+	return "hip";
+      break;
+    case omp_ipr_vendor:
+      *ret_code = omp_irc_type_int;
+      return NULL;
+    case omp_ipr_vendor_name:
+      *ret_code = omp_irc_success;
+      return "nvidia";
+    case omp_ipr_device_num:
+      *ret_code = omp_irc_type_int;
+      return NULL;
+    case omp_ipr_platform:
+      *ret_code = omp_irc_no_value;
+      return NULL;
+    case omp_ipr_device:
+      *ret_code = omp_irc_type_int;
+      return NULL;
+    case omp_ipr_device_context:
+      if (obj->fr == omp_ifr_cuda)
+	*ret_code = omp_irc_no_value;
+      else
+	*ret_code = omp_irc_type_ptr;
+      return NULL;
+    case omp_ipr_targetsync:
+      if (!obj->stream)
+	*ret_code = omp_irc_no_value;
+      else
+	*ret_code = omp_irc_type_ptr;
+      return NULL;
+    default:
+      break;
+    }
+  __builtin_unreachable ();
+  return NULL;
+}
+
+const char *
+GOMP_OFFLOAD_get_interop_type_desc (struct interop_obj_t *obj,
+				    omp_interop_property_t property_id)
+{
+  _Static_assert (omp_ipr_targetsync == omp_ipr_first,
+		  "omp_ipr_targetsync == omp_ipr_first");
+  _Static_assert (omp_ipr_platform - omp_ipr_first + 1 == 4,
+		  "omp_ipr_platform - omp_ipr_first + 1 == 4");
+  static const char *desc_cuda[] = {"N/A",		/* platform */
+				    "int",		/* device */
+				    "N/A",		/* device_context */
+				    "cudaStream_t"};	/* targetsync */
+  static const char *desc_cuda_driver[] = {"N/A",	/* platform */
+					   "CUdevice",	/* device */
+					   "CUcontext",	/* device_context */
+					   "CUstream"};	/* targetsync */
+  static const char *desc_hip[] = {"N/A",		/* platform */
+				   "hipDevice_t",	/* device */
+				   "hipCtx_t",		/* device_context */
+				   "hipStream_t"};	/* targetsync */
+			     
+  if (obj->fr == omp_ifr_cuda)
+    return desc_cuda[omp_ipr_platform - property_id];
+  if (obj->fr == omp_ifr_cuda_driver)
+    return desc_cuda_driver[omp_ipr_platform - property_id];
+  if (obj->fr == omp_ifr_hip)
+    return desc_hip[omp_ipr_platform - property_id];
+  return NULL;
+}
 
 void
 GOMP_OFFLOAD_run (int ord, void *tgt_fn, void *tgt_vars, void **args)
diff --git a/libgomp/target.c b/libgomp/target.c
index cc1074243e0..14f06d388c4 100644
--- a/libgomp/target.c
+++ b/libgomp/target.c
@@ -5122,40 +5122,105 @@ omp_get_num_interop_properties (const omp_interop_t interop
   return 0;
 }
 
+void
+GOMP_interop (void *interop, int device_num, bool targetsync, bool destroy,
+	      const char *preferred_type, const int *preferred_type_int)
+{
+  struct interop_obj_t **obj = (struct interop_obj_t **) interop;
+  if (destroy && *obj == NULL)
+    return;
+  if (destroy)
+    device_num = (*obj)->device_num;
+
+  struct gomp_device_descr *devicep = resolve_device (device_num, false);
+  if (devicep == NULL || (!destroy && devicep->get_interop_func == NULL))
+    {
+      *obj = NULL;
+      return;
+    }
+  if (!destroy)
+    *obj = (struct interop_obj_t *) calloc (1, sizeof (struct interop_obj_t));
+  if (devicep->get_interop_func)
+    devicep->get_interop_func (*obj, devicep->target_id, targetsync, destroy,
+			       preferred_type, preferred_type_int);
+  if (!destroy)
+    (*obj)->device_num = (device_num == GOMP_DEVICE_ICV
+		       ? gomp_icv (false)->default_device_var : device_num);
+  else
+    {
+      free (*obj);
+      *obj = NULL;
+    }
+}
+
+
 omp_intptr_t
-omp_get_interop_int (const omp_interop_t interop __attribute__ ((unused)),
+omp_get_interop_int (const omp_interop_t interop,
 		     omp_interop_property_t property_id,
 		     omp_interop_rc_t *ret_code)
 {
+  struct interop_obj_t *obj = (struct interop_obj_t *) interop;
+  struct gomp_device_descr *devicep;
+
   if (property_id < omp_ipr_first || property_id >= 0)
-    *ret_code = omp_irc_out_of_range;
-  else
-    *ret_code = omp_irc_empty;  /* Assume omp_interop_none.  */
-  return 0;
+    {
+      *ret_code = omp_irc_out_of_range;
+      return 0;
+    }
+  if (obj == NULL
+      || (devicep = resolve_device (obj->device_num, false)) == NULL
+      || devicep->get_interop_int_func == NULL)
+    {
+      *ret_code = omp_irc_empty;  /* Assume omp_interop_none.  */
+      return 0;
+    }
+  return devicep->get_interop_int_func (obj, property_id, ret_code);
 }
 
 void *
-omp_get_interop_ptr (const omp_interop_t interop __attribute__ ((unused)),
+omp_get_interop_ptr (const omp_interop_t interop,
 		     omp_interop_property_t property_id,
 		     omp_interop_rc_t *ret_code)
 {
+  struct interop_obj_t *obj = (struct interop_obj_t *) interop;
+  struct gomp_device_descr *devicep;
+
   if (property_id < omp_ipr_first || property_id >= 0)
-    *ret_code = omp_irc_out_of_range;
-  else
-    *ret_code = omp_irc_empty;  /* Assume omp_interop_none.  */
-  return NULL;
+    {
+      *ret_code = omp_irc_out_of_range;
+      return NULL;
+    }
+  if (obj == NULL
+      || (devicep = resolve_device (obj->device_num, false)) == NULL
+      || devicep->get_interop_ptr_func == NULL)
+    {
+      *ret_code = omp_irc_empty;  /* Assume omp_interop_none.  */
+      return NULL;
+    }
+  return devicep->get_interop_ptr_func (obj, property_id, ret_code);
 }
 
 const char *
-omp_get_interop_str (const omp_interop_t interop __attribute__ ((unused)),
+omp_get_interop_str (const omp_interop_t interop,
 		     omp_interop_property_t property_id,
 		     omp_interop_rc_t *ret_code)
 {
+  struct interop_obj_t *obj = (struct interop_obj_t *) interop;
+  struct gomp_device_descr *devicep;
+
   if (property_id < omp_ipr_first || property_id >= 0)
-    *ret_code = omp_irc_out_of_range;
-  else
-    *ret_code = omp_irc_empty;  /* Assume omp_interop_none.  */
-  return NULL;
+    {
+      *ret_code = omp_irc_out_of_range;
+      return NULL;
+    }
+  if (obj == NULL
+      || (devicep = resolve_device (obj->device_num, false)) == NULL
+      || devicep->get_interop_str_func == NULL)
+    {
+      *ret_code = omp_irc_empty;  /* Assume omp_interop_none.  */
+      return NULL;
+    }
+  return devicep->get_interop_str_func (obj, property_id, ret_code);
 }
 
 const char *
@@ -5171,22 +5236,31 @@ omp_get_interop_name (const omp_interop_t interop __attribute__ ((unused)),
 }
 
 const char *
-omp_get_interop_type_desc (const omp_interop_t interop __attribute__ ((unused)),
-			   omp_interop_property_t property_id
-			   __attribute__ ((unused)))
+omp_get_interop_type_desc (const omp_interop_t interop,
+			   omp_interop_property_t property_id)
 {
-  static const char *desc = {"int",             /* fr_id */
-                             "const char*",     /* fr_name */
-                             "int",             /* vendor */
-                             "const char *",    /* vendor_name */
-                             "int"};            /* device_num */
-  if (interop == omp_interop_none)
+  static const char *desc[omp_ipr_fr_id - omp_ipr_device_num + 1]
+    = {"int",             /* fr_id */
+       "const char *",    /* fr_name */
+       "int",             /* vendor */
+       "const char *",    /* vendor_name */
+       "int"};            /* device_num */
+
+  struct interop_obj_t *obj = (struct interop_obj_t *) interop;
+  struct gomp_device_descr *devicep;
+
+  if (property_id > omp_ipr_fr_id || property_id < omp_ipr_first)
     return NULL;
-  if (property_id > fr_id || property_id < omp_ipr_first)
+
+  if (obj == NULL
+      || (devicep = resolve_device (obj->device_num, false)) == NULL
+      || devicep->get_interop_type_desc_func == NULL)
     return NULL;
+
   if (property_id >= omp_ipr_device_num)
     return desc[omp_ipr_fr_id - property_id];
-  return NULL; /* Fixme: Call plugin.  */
+
+  return devicep->get_interop_type_desc_func (obj, property_id);
 }
 
 const char *
@@ -5271,6 +5345,14 @@ gomp_load_plugin_for_device (struct gomp_device_descr *device,
   DLSYM (host2dev);
   DLSYM_OPT (memcpy2d, memcpy2d);
   DLSYM_OPT (memcpy3d, memcpy3d);
+  if (DLSYM_OPT (get_interop, get_interop))
+    {
+      DLSYM (get_interop_int);
+      DLSYM (get_interop_ptr);
+      DLSYM (get_interop_str);
+      DLSYM (get_interop_type_desc);
+    }
+
   device->capabilities = device->get_caps_func ();
   if (device->capabilities & GOMP_OFFLOAD_CAP_OPENMP_400)
     {

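For reviewers: the type-description lookup relies on the property IDs being negative and descending, so "base - property_id" gives a zero-based table index. Below is a minimal standalone sketch of that scheme, not part of the patch. The demo_ names are hypothetical stand-ins for the omp_ipr_* enumerators (values as in the OpenMP spec), and only the cuda_driver table is modelled.

```c
#include <stddef.h>

/* Hypothetical stand-ins for omp_interop_property_t; the values are
   assumed to match the OpenMP-specified omp_ipr_* constants.  */
enum demo_ipr
{
  demo_ipr_fr_id = -1,
  demo_ipr_fr_name = -2,
  demo_ipr_vendor = -3,
  demo_ipr_vendor_name = -4,
  demo_ipr_device_num = -5,
  demo_ipr_platform = -6,
  demo_ipr_device = -7,
  demo_ipr_device_context = -8,
  demo_ipr_targetsync = -9,
  demo_ipr_first = -9
};

/* Return the type description for PROPERTY_ID, or NULL if out of range.
   The first five properties are runtime-independent (target.c's part);
   the remaining four come from a per-runtime table (the plugin's part).  */
static const char *
demo_type_desc (int property_id)
{
  static const char *generic[] = {"int",		/* fr_id */
				  "const char *",	/* fr_name */
				  "int",		/* vendor */
				  "const char *",	/* vendor_name */
				  "int"};		/* device_num */
  static const char *cuda_driver[] = {"N/A",		/* platform */
				      "CUdevice",	/* device */
				      "CUcontext",	/* device_context */
				      "CUstream"};	/* targetsync */

  if (property_id > demo_ipr_fr_id || property_id < demo_ipr_first)
    return NULL;
  /* IDs -1 .. -5 map to generic[0] .. generic[4].  */
  if (property_id >= demo_ipr_device_num)
    return generic[demo_ipr_fr_id - property_id];
  /* IDs -6 .. -9 map to cuda_driver[0] .. cuda_driver[3].  */
  return cuda_driver[demo_ipr_platform - property_id];
}
```

With this layout, demo_type_desc (demo_ipr_targetsync) yields "CUstream" and any ID outside -9..-1 yields NULL, which is the behavior the two static assertions in the plugin are meant to protect.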