My recent patch to fix barriers in nested teams relied on the assumption that nested teams would only ever have one thread each.

However, that can be changed by altering the relevant ICV (nest_var), via a runtime call or an environment variable (not that the accelerator-side libgomp can see the host environment), so the assumption wasn't completely safe.

This patch ensures that the previous assumption is safe, by ignoring the relevant ICV on NVPTX and AMD GCN, neither of which can support it.
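
For illustration only, here is a minimal standalone sketch of the early-exit checks in question. It is not the libgomp code: `struct model_state`, `model_resolve_num_threads` and the `fixed_thread_count` flag are hypothetical names invented for this example, the flag stands in for the compile-time `__AMDGCN__`/`__nvptx__` test, and the rest of the real thread-count resolution is collapsed into simply returning the requested count.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the few values the real function reads.  */
struct model_state
{
  unsigned active_level;      /* Number of enclosing active parallel regions.  */
  bool nest_var;              /* The nesting ICV.  */
  unsigned max_active_levels; /* Stands in for gomp_max_active_levels_var.  */
  bool fixed_thread_count;    /* True models an nvptx or amdgcn build.  */
};

/* Models only the early exits touched by the patch.  Returns 1 when the
   new team must be single-threaded; otherwise returns SPECIFIED unchanged
   (a simplification of what the real function does afterwards).  */
static unsigned
model_resolve_num_threads (const struct model_state *s, unsigned specified)
{
  if (specified == 1)
    return 1;

  /* On fixed-thread-count targets nest_var is never consulted, so any
     nested parallel region gets exactly one thread.  */
  if (s->active_level >= 1 && (s->fixed_thread_count || !s->nest_var))
    return 1;
  else if (s->active_level >= s->max_active_levels)
    return 1;

  return specified;
}
```

In this model, a request for 4 threads at active_level 1 with nest_var set returns 4 when fixed_thread_count is false and 1 when it is true.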

OK to commit?

Andrew
libgomp: Enforce 1-thread limit in subteams

Accelerators with fixed thread counts will break if nested teams are expected
to have multiple threads each.

libgomp/ChangeLog:

2020-09-29  Andrew Stubbs  <a...@codesourcery.com>

	* parallel.c (gomp_resolve_num_threads): Ignore nest_var on nvptx
	and amdgcn targets.

diff --git a/libgomp/parallel.c b/libgomp/parallel.c
index 2423f11f44a..0618056a7fe 100644
--- a/libgomp/parallel.c
+++ b/libgomp/parallel.c
@@ -48,7 +48,14 @@ gomp_resolve_num_threads (unsigned specified, unsigned count)
 
   if (specified == 1)
     return 1;
-  else if (thr->ts.active_level >= 1 && !icv->nest_var)
+
+  /* Accelerators with fixed thread counts require this to return 1 for
+     nested parallel regions.  */
+  if (thr->ts.active_level >= 1
+#if !defined(__AMDGCN__) && !defined(__nvptx__)
+      && !icv->nest_var
+#endif
+      )
     return 1;
   else if (thr->ts.active_level >= gomp_max_active_levels_var)
     return 1;
