Hi Andrew,

Am 28.06.24 um 12:24 schrieb Andrew Stubbs:
--- a/gcc/config/gcn/gcn.cc
+++ b/gcc/config/gcn/gcn.cc
@@ -70,6 +70,11 @@ static bool ext_gcn_constants_init = 0;
enum gcn_isa gcn_isa = ISA_GCN3; /* Default to GCN3. */ +/* Record whether the host compiler added "omp unifed memory" attributes to
+   any functions.  We can then pass this on to mkoffload to ensure xnack is
+   compatible there too.  */
+static bool unified_shared_memory_enabled = false;

…

Why is this needed instead of relying on omp_requires?

+  if (fndecl && lookup_attribute ("omp unified memory",
+       case PROCESSOR_GFX1036:
+       case PROCESSOR_GFX1100:
+       case PROCESSOR_GFX1103:
+         error ("GPU architecture does not support Unified Shared Memory");
+         break;

This seems to be wrong in two ways:

First, while not really usable, it feels wrong to print an error (and not a warning) if USM is not supported. Running the code with host fallback seems to be fine, albeit there is admittedly the question whether it makes sense to generate the GCN code in this case.

Secondly, gfx1036 has the property HSA_AMD_SYSTEM_INFO_SVM_ACCESSIBLE_BY_DEFAULT, i.e. this APU supports USM, albeit not XNACK.

* * *

+       default:
+         if (flag_xnack == HSACO_ATTR_OFF)
+           error ("Unified Shared Memory is enabled, but XNACK is disabled");

Likewise – I understand that USM won't work in this case, but the question is whether that should be a warning or an error as it does work (by using host fallback in this case).

* * *

+  /* Emit a constructor function to set the HSA_XNACK environment variable.
+     This must be done before the ROCr runtime library is loaded.
+     We never override a user value (exit empty string), but we do emit a
+     useful diagnostic in the wrong mode (the ROCr message is not good.  */
+  if (TEST_XNACK_OFF (elf_flags) && unified_shared_memory_enabled)
+    fatal_error (input_location,
+                "conflicting settings; XNACK is forced off but Unified "
+                "Shared Memory is on");

Is this reachable? I thought the code in gcn-lto1 already prints a diagnostic?

Otherwise, the same applies here: error vs. warning.

+  if (!TEST_XNACK_ANY (elf_flags) || unified_shared_memory_enabled)

I think you need to exclude XNACK UNSUPPORTED on the RHS, albeit I might have missed some condition, which ensures that unified_shared_memory_enabled is not set in that case.

+    fprintf (cfile,
+            "static __attribute__((constructor))\n"
+            "void configure_xnack (void)\n"
+            "{\n"

Constructors are somewhat expensive. Why don't you combine all three constructors features into a single constructor?

+            "  const char *val = getenv (\"HSA_XNACK\");\n"
+            "  if (!val || val[0] == '\\0')\n"
+            "    setenv (\"HSA_XNACK\", \"%d\", true);\n"
+            "  else if (%s)\n"
+            "    {\n"
+            "      fprintf (stderr, \"error: HSA_XNACK=%%s is incompatible; "
+                           "please unset\\n\", val);\n"
+            "      exit (1);\n"
+            "    }\n"

This looks wrong – not having support for USM is not an error but should fall back to a host execution.

But, admittedly, XNACK- and HSA_XNACK=1 or +/0 are incompatible, i.e. we might want to keep the warning in that case.

--- a/gcc/omp-low.cc
+++ b/gcc/omp-low.cc
@@ -2124,6 +2124,10 @@ create_omp_child_function (omp_context *ctx, bool 
task_copy)
        DECL_ATTRIBUTES (decl)
          = tree_cons (get_identifier (target_attr),
                       NULL_TREE, DECL_ATTRIBUTES (decl));
+      if (omp_requires_mask & OMP_REQUIRES_UNIFIED_SHARED_MEMORY)
+       DECL_ATTRIBUTES (decl)
+         = tree_cons (get_identifier ("omp unified memory"),
+                      NULL_TREE, DECL_ATTRIBUTES (decl));

As mentioned, it is unclear to me why 'omp_requires' is not enough. In any case, it also needs to handle the (admittedly newer) self_maps flag.

Finally, I think it would be user friendly to mention this feature in libgomp.texi.

Tobias

Reply via email to