Hi!

OpenMP 5.0 allows the teams construct, which previously had to be strictly
nested inside of a target construct (i.e. with no OpenMP construct and not
even any user code in between), to also appear as a host construct that is
not nested in any OpenMP construct at all.  The primary goal is use in NUMA
setups, where the construct creates a league of teams (threads), each bound
to one NUMA node, and lets those teams run pretty much independently (no
synchronization between them, except for the final reduction processing if
required); the construct only waits for the work of all the teams at its
end.

This patch implements the compiler side of this, and for now adds a simple
implementation to the library so that it can be tested; that implementation
is even conforming if one ignores mixing of user POSIX threads with OpenMP
(e.g. invoking #pragma omp teams from multiple POSIX threads, which is
outside of the standard anyway).  It simply runs the different teams
sequentially in the same thread.

Once there is agreement on the defaults, I'll change this to the final
implementation.  I believe that when num_teams is not specified, the
default should be the number of NUMA nodes that the CPUs in the allowed
CPU set belong to (1 when not on a NUMA system); when users specify
something smaller or larger, there are various options and we need to
figure out what is best for them.

Tested on x86_64-linux, committed to gomp-5_0-branch.

2018-07-25  Jakub Jelinek  <ja...@redhat.com>

        * gimple.h (enum gf_mask): Add GF_OMP_TEAMS_HOST.
        (struct gimple_statement_omp_taskreg): Add GIMPLE_OMP_TEAMS to
        comments.
        (struct gimple_statement_omp_single_layout): And remove here.
        (struct gomp_teams): Inherit from gimple_statement_omp_taskreg rather
        than gimple_statement_omp_single_layout.
        (is_a_helper <gimple_statement_omp_taskreg *>::test): Allow
        GIMPLE_OMP_TEAMS.
        (is_a_helper <const gimple_statement_omp_taskreg *>::test): Likewise.
        (gimple_omp_subcode): Formatting fix.
        (gimple_omp_teams_child_fn, gimple_omp_teams_child_fn_ptr,
        gimple_omp_teams_set_child_fn, gimple_omp_teams_data_arg,
        gimple_omp_teams_data_arg_ptr, gimple_omp_teams_set_data_arg,
        gimple_omp_teams_host, gimple_omp_teams_set_host): New inline
        functions.
        * gimple.def (GIMPLE_OMP_TEAMS): Use GSS_OMP_PARALLEL_LAYOUT instead
        of GSS_OMP_SINGLE_LAYOUT, adjust comments.
        * gimplify.c (enum omp_region_type): Reserve bits 1 and 2 for
        auxiliary flags, renumber values of most of ORT_* enumerators,
        add ORT_HOST_TEAMS and ORT_COMBINED_HOST_TEAMS enumerators.
        (maybe_fold_stmt): Don't fold even in host teams regions.
        (gimplify_scan_omp_clauses, gimplify_omp_for): Adjust tests for
        ORT_COMBINED_TEAMS.
        (gimplify_omp_workshare): Set ort to ORT_HOST_TEAMS or
        ORT_COMBINED_HOST_TEAMS if not inside of target construct.  If
        host teams, use gimplify_and_return_first etc. for body like
        for target or target data constructs, and at the end call
        gimple_omp_teams_set_host on the GIMPLE_OMP_TEAMS object.
        * omp-builtins.def (BUILT_IN_GOMP_TEAMS_REG): New builtin.
        * omp-low.c (is_host_teams_ctx): New function.
        (is_taskreg_ctx): Return true also if is_host_teams_ctx.
        (scan_sharing_clauses): Don't ignore shared clauses in
        is_host_teams_ctx contexts.
        (finish_taskreg_scan): Handle GIMPLE_OMP_TEAMS like
        GIMPLE_OMP_PARALLEL.
        (scan_omp_teams): Handle host teams constructs.
        (check_omp_nesting_restrictions): Allow teams with no outer
        OpenMP context.  Adjust diagnostics for teams strictly nested into
        some explicit OpenMP construct other than target.
        (scan_omp_1_stmt) <case GIMPLE_OMP_TEAMS>: Temporarily bump
        taskreg_nesting_level while scanning host teams construct.
        (lower_rec_input_clauses): Don't ignore shared clauses in
        is_host_teams_ctx contexts.
        (lower_omp_1): Use lower_omp_taskreg instead of lower_omp_teams
        for host teams constructs.
        * omp-expand.c (expand_teams_call): New function.
        (expand_omp_taskreg): Allow GIMPLE_OMP_TEAMS and call
        expand_teams_call for it.  Formatting fix.
        (expand_omp_synch): For host teams call expand_omp_taskreg.
c/
        * c-parser.c (c_parser_omp_teams): Force a BIND_EXPR with BLOCK
        around teams body.  Use SET_EXPR_LOCATION.
        (c_parser_omp_target): Use SET_EXPR_LOCATION.
cp/
        * cp-tree.h (finish_omp_atomic): Add LOC argument.
        * parser.c (cp_parser_omp_atomic): Pass pragma_tok->location as
        LOC to finish_omp_atomic.
        (cp_parser_omp_single): Use SET_EXPR_LOCATION.
        (cp_parser_omp_teams): Force a BIND_EXPR with BLOCK around teams
        body.
        * semantics.c (finish_omp_atomic): Add LOC argument, pass it through
        to c_finish_omp_atomic and set it as location of OMP_ATOMIC* trees.
        * pt.c (tsubst_expr): Force a BIND_EXPR with BLOCK around teams body.
        Adjust finish_omp_atomic caller.
testsuite/
        * c-c++-common/gomp/teams-1.c: New test.
        * c-c++-common/gomp/teams-2.c: New test.
        * g++.dg/gomp/tpl-atomic-2.C: Adjust expected diagnostic lines.
        * gcc.dg/gomp/teams-1.c: Likewise.
libgomp/
        * Makefile.am (libgomp_la_SOURCES): Add teams.c.
        * libgomp_g.h (GOMP_teams_reg): New prototype.
        * libgomp.map (GOMP_5.0): Export GOMP_teams_reg.
        * icv-device.c (omp_get_num_teams, omp_get_team_num): Move these
        functions to ...
        * teams.c: ... here.  New file.
        * config/nvptx/icv-device.c (omp_get_num_teams, omp_get_team_num):
        Move these functions to ...
        * config/nvptx/teams.c: ... here.  New file.
        * testsuite/libgomp.c++/for-16.C: New test.
        * testsuite/libgomp.c++/for-26.C: New test.
        * testsuite/libgomp.c-c++-common/for-14.c: New test.
        * testsuite/libgomp.c-c++-common/for-15.c: New test.
        * testsuite/libgomp.c-c++-common/pr66199-10.c: New test.
        * testsuite/libgomp.c-c++-common/pr66199-11.c: New test.
        * testsuite/libgomp.c-c++-common/pr66199-12.c: New test.
        * testsuite/libgomp.c-c++-common/pr66199-13.c: New test.
        * testsuite/libgomp.c-c++-common/pr66199-14.c: New test.
        * testsuite/libgomp.c/teams-1.c: New test.
        * testsuite/libgomp.c/teams-2.c: New test.
        * testsuite/libgomp.c/thread-limit-5.c: New test.
        * testsuite/libgomp.c/thread-limit-4.c: New test.
        * Makefile.in: Regenerated.

--- gcc/gimple.h.jj     2018-07-17 13:06:02.690822069 +0200
+++ gcc/gimple.h        2018-07-23 18:10:02.209626199 +0200
@@ -184,6 +184,7 @@ enum gf_mask {
     GF_OMP_TARGET_KIND_OACC_DECLARE = 10,
     GF_OMP_TARGET_KIND_OACC_HOST_DATA = 11,
     GF_OMP_TEAMS_GRID_PHONY    = 1 << 0,
+    GF_OMP_TEAMS_HOST          = 1 << 1,
 
     /* True on an GIMPLE_OMP_RETURN statement if the return does not require
        a thread synchronization via some sort of barrier.  The exact barrier
@@ -638,7 +639,7 @@ struct GTY((tag("GSS_OMP_FOR")))
 };
 
 
-/* GIMPLE_OMP_PARALLEL, GIMPLE_OMP_TARGET, GIMPLE_OMP_TASK */
+/* GIMPLE_OMP_PARALLEL, GIMPLE_OMP_TARGET, GIMPLE_OMP_TASK, GIMPLE_OMP_TEAMS */
 
 struct GTY((tag("GSS_OMP_PARALLEL_LAYOUT")))
   gimple_statement_omp_parallel_layout : public gimple_statement_omp
@@ -664,7 +665,8 @@ struct GTY((tag("GSS_OMP_PARALLEL_LAYOUT
 {
     /* No extra fields; adds invariant:
          stmt->code == GIMPLE_OMP_PARALLEL
-        || stmt->code == GIMPLE_OMP_TASK.  */
+        || stmt->code == GIMPLE_OMP_TASK
+        || stmt->code == GIMPLE_OMP_TEAMS.  */
 };
 
 /* GIMPLE_OMP_PARALLEL */
@@ -738,8 +740,7 @@ struct GTY((tag("GSS_OMP_CONTINUE")))
   tree control_use;
 };
 
-/* GIMPLE_OMP_SINGLE, GIMPLE_OMP_TEAMS, GIMPLE_OMP_ORDERED,
-   GIMPLE_OMP_TASKGROUP.  */
+/* GIMPLE_OMP_SINGLE, GIMPLE_OMP_ORDERED, GIMPLE_OMP_TASKGROUP.  */
 
 struct GTY((tag("GSS_OMP_SINGLE_LAYOUT")))
   gimple_statement_omp_single_layout : public gimple_statement_omp
@@ -757,8 +758,8 @@ struct GTY((tag("GSS_OMP_SINGLE_LAYOUT")
          stmt->code == GIMPLE_OMP_SINGLE.  */
 };
 
-struct GTY((tag("GSS_OMP_SINGLE_LAYOUT")))
-  gomp_teams : public gimple_statement_omp_single_layout
+struct GTY((tag("GSS_OMP_PARALLEL_LAYOUT")))
+  gomp_teams : public gimple_statement_omp_taskreg
 {
     /* No extra fields; adds invariant:
          stmt->code == GIMPLE_OMP_TEAMS.  */
@@ -1123,7 +1124,9 @@ template <>
 inline bool
 is_a_helper <gimple_statement_omp_taskreg *>::test (gimple *gs)
 {
-  return gs->code == GIMPLE_OMP_PARALLEL || gs->code == GIMPLE_OMP_TASK;
+  return (gs->code == GIMPLE_OMP_PARALLEL
+         || gs->code == GIMPLE_OMP_TASK
+         || gs->code == GIMPLE_OMP_TEAMS);
 }
 
 template <>
@@ -1339,7 +1342,9 @@ template <>
 inline bool
 is_a_helper <const gimple_statement_omp_taskreg *>::test (const gimple *gs)
 {
-  return gs->code == GIMPLE_OMP_PARALLEL || gs->code == GIMPLE_OMP_TASK;
+  return (gs->code == GIMPLE_OMP_PARALLEL
+         || gs->code == GIMPLE_OMP_TASK
+         || gs->code == GIMPLE_OMP_TEAMS);
 }
 
 template <>
@@ -2196,7 +2201,7 @@ static inline unsigned
 gimple_omp_subcode (const gimple *s)
 {
   gcc_gimple_checking_assert (gimple_code (s) >= GIMPLE_OMP_ATOMIC_LOAD
-             && gimple_code (s) <= GIMPLE_OMP_TEAMS);
+                             && gimple_code (s) <= GIMPLE_OMP_TEAMS);
   return s->subcode;
 }
 
@@ -5920,6 +5925,60 @@ gimple_omp_teams_set_clauses (gomp_teams
   omp_teams_stmt->clauses = clauses;
 }
 
+/* Return the child function used to hold the body of OMP_TEAMS_STMT.  */
+
+static inline tree
+gimple_omp_teams_child_fn (const gomp_teams *omp_teams_stmt)
+{
+  return omp_teams_stmt->child_fn;
+}
+
+/* Return a pointer to the child function used to hold the body of
+   OMP_TEAMS_STMT.  */
+
+static inline tree *
+gimple_omp_teams_child_fn_ptr (gomp_teams *omp_teams_stmt)
+{
+  return &omp_teams_stmt->child_fn;
+}
+
+
+/* Set CHILD_FN to be the child function for OMP_TEAMS_STMT.  */
+
+static inline void
+gimple_omp_teams_set_child_fn (gomp_teams *omp_teams_stmt, tree child_fn)
+{
+  omp_teams_stmt->child_fn = child_fn;
+}
+
+
+/* Return the artificial argument used to send variables and values
+   from the parent to the children threads in OMP_TEAMS_STMT.  */
+
+static inline tree
+gimple_omp_teams_data_arg (const gomp_teams *omp_teams_stmt)
+{
+  return omp_teams_stmt->data_arg;
+}
+
+
+/* Return a pointer to the data argument for OMP_TEAMS_STMT.  */
+
+static inline tree *
+gimple_omp_teams_data_arg_ptr (gomp_teams *omp_teams_stmt)
+{
+  return &omp_teams_stmt->data_arg;
+}
+
+
+/* Set DATA_ARG to be the data argument for OMP_TEAMS_STMT.  */
+
+static inline void
+gimple_omp_teams_set_data_arg (gomp_teams *omp_teams_stmt, tree data_arg)
+{
+  omp_teams_stmt->data_arg = data_arg;
+}
+
 /* Return the kernel_phony flag of an OMP_TEAMS_STMT.  */
 
 static inline bool
@@ -5939,6 +5998,25 @@ gimple_omp_teams_set_grid_phony (gomp_te
     omp_teams_stmt->subcode &= ~GF_OMP_TEAMS_GRID_PHONY;
 }
 
+/* Return the host flag of an OMP_TEAMS_STMT.  */
+
+static inline bool
+gimple_omp_teams_host (const gomp_teams *omp_teams_stmt)
+{
+  return (gimple_omp_subcode (omp_teams_stmt) & GF_OMP_TEAMS_HOST) != 0;
+}
+
+/* Set host flag of an OMP_TEAMS_STMT to VALUE.  */
+
+static inline void
+gimple_omp_teams_set_host (gomp_teams *omp_teams_stmt, bool value)
+{
+  if (value)
+    omp_teams_stmt->subcode |= GF_OMP_TEAMS_HOST;
+  else
+    omp_teams_stmt->subcode &= ~GF_OMP_TEAMS_HOST;
+}
+
 /* Return the clauses associated with OMP_SECTIONS GS.  */
 
 static inline tree
--- gcc/gimple.def.jj   2018-04-30 13:50:11.404882261 +0200
+++ gcc/gimple.def      2018-07-19 18:25:54.117264751 +0200
@@ -367,10 +367,12 @@ DEFGSCODE(GIMPLE_OMP_SINGLE, "gimple_omp
    implement the MAP clauses.  */
 DEFGSCODE(GIMPLE_OMP_TARGET, "gimple_omp_target", GSS_OMP_PARALLEL_LAYOUT)
 
-/* GIMPLE_OMP_TEAMS <BODY, CLAUSES> represents #pragma omp teams
+/* GIMPLE_OMP_TEAMS <BODY, CLAUSES, CHILD_FN, DATA_ARG> represents
+   #pragma omp teams
    BODY is the sequence of statements inside the single section.
-   CLAUSES is an OMP_CLAUSE chain holding the associated clauses.  */
-DEFGSCODE(GIMPLE_OMP_TEAMS, "gimple_omp_teams", GSS_OMP_SINGLE_LAYOUT)
+   CLAUSES is an OMP_CLAUSE chain holding the associated clauses.
+   CHILD_FN and DATA_ARG like for GIMPLE_OMP_PARALLEL.  */
+DEFGSCODE(GIMPLE_OMP_TEAMS, "gimple_omp_teams", GSS_OMP_PARALLEL_LAYOUT)
 
 /* GIMPLE_OMP_ORDERED <BODY, CLAUSES> represents #pragma omp ordered.
    BODY is the sequence of statements to execute in the ordered section.
--- gcc/gimplify.c.jj   2018-07-17 17:24:39.975318596 +0200
+++ gcc/gimplify.c      2018-07-24 14:21:42.504593243 +0200
@@ -122,34 +122,36 @@ enum gimplify_omp_var_data
 enum omp_region_type
 {
   ORT_WORKSHARE = 0x00,
-  ORT_SIMD     = 0x01,
+  ORT_SIMD     = 0x04,
 
-  ORT_PARALLEL = 0x02,
-  ORT_COMBINED_PARALLEL = 0x03,
+  ORT_PARALLEL = 0x08,
+  ORT_COMBINED_PARALLEL = ORT_PARALLEL | 1,
 
-  ORT_TASK     = 0x04,
-  ORT_UNTIED_TASK = 0x05,
+  ORT_TASK     = 0x10,
+  ORT_UNTIED_TASK = ORT_TASK | 1,
 
-  ORT_TEAMS    = 0x08,
-  ORT_COMBINED_TEAMS = 0x09,
+  ORT_TEAMS    = 0x20,
+  ORT_COMBINED_TEAMS = ORT_TEAMS | 1,
+  ORT_HOST_TEAMS = ORT_TEAMS | 2,
+  ORT_COMBINED_HOST_TEAMS = ORT_COMBINED_TEAMS | 2,
 
   /* Data region.  */
-  ORT_TARGET_DATA = 0x10,
+  ORT_TARGET_DATA = 0x40,
 
   /* Data region with offloading.  */
-  ORT_TARGET   = 0x20,
-  ORT_COMBINED_TARGET = 0x21,
+  ORT_TARGET   = 0x80,
+  ORT_COMBINED_TARGET = ORT_TARGET | 1,
 
   /* OpenACC variants.  */
-  ORT_ACC      = 0x40,  /* A generic OpenACC region.  */
+  ORT_ACC      = 0x100,  /* A generic OpenACC region.  */
   ORT_ACC_DATA = ORT_ACC | ORT_TARGET_DATA, /* Data construct.  */
   ORT_ACC_PARALLEL = ORT_ACC | ORT_TARGET,  /* Parallel construct */
-  ORT_ACC_KERNELS  = ORT_ACC | ORT_TARGET | 0x80,  /* Kernels construct.  */
-  ORT_ACC_HOST_DATA = ORT_ACC | ORT_TARGET_DATA | 0x80,  /* Host data.  */
+  ORT_ACC_KERNELS  = ORT_ACC | ORT_TARGET | 2,  /* Kernels construct.  */
+  ORT_ACC_HOST_DATA = ORT_ACC | ORT_TARGET_DATA | 2,  /* Host data.  */
 
   /* Dummy OpenMP region, used to disable expansion of
      DECL_VALUE_EXPRs in taskloop pre body.  */
-  ORT_NONE     = 0x100
+  ORT_NONE     = 0x200
 };
 
 /* Gimplify hashtable helper.  */
@@ -3168,6 +3170,8 @@ maybe_fold_stmt (gimple_stmt_iterator *g
   for (ctx = gimplify_omp_ctxp; ctx; ctx = ctx->outer_context)
     if ((ctx->region_type & (ORT_TARGET | ORT_PARALLEL | ORT_TASK)) != 0)
       return false;
+    else if ((ctx->region_type & ORT_HOST_TEAMS) == ORT_HOST_TEAMS)
+      return false;
   return fold_stmt (gsi);
 }
 
@@ -8040,7 +8044,8 @@ gimplify_scan_omp_clauses (tree *list_p,
            }
          if (outer_ctx
              && (outer_ctx->region_type == ORT_COMBINED_PARALLEL
-                 || outer_ctx->region_type == ORT_COMBINED_TEAMS)
+                 || ((outer_ctx->region_type & ORT_COMBINED_TEAMS)
+                     == ORT_COMBINED_TEAMS))
              && splay_tree_lookup (outer_ctx->variables,
                                    (splay_tree_key) decl) == NULL)
            {
@@ -8088,7 +8093,8 @@ gimplify_scan_omp_clauses (tree *list_p,
                                            GOVD_LASTPRIVATE | GOVD_SEEN);
                          octx = octx->outer_context;
                          if (octx
-                             && octx->region_type == ORT_COMBINED_TEAMS
+                             && ((octx->region_type & ORT_COMBINED_TEAMS)
+                                 == ORT_COMBINED_TEAMS)
                              && (splay_tree_lookup (octx->variables,
                                                     (splay_tree_key) decl)
                                  == NULL))
@@ -8235,7 +8241,8 @@ gimplify_scan_omp_clauses (tree *list_p,
                           && octx == outer_ctx)
                    flags = GOVD_SEEN | GOVD_SHARED;
                  else if (octx
-                          && octx->region_type == ORT_COMBINED_TEAMS)
+                          && ((octx->region_type & ORT_COMBINED_TEAMS)
+                              == ORT_COMBINED_TEAMS))
                    flags = GOVD_SEEN | GOVD_SHARED;
                  else if (octx
                           && octx->region_type == ORT_COMBINED_TARGET)
@@ -10714,7 +10721,7 @@ gimplify_omp_for (tree *expr_p, gimple_s
                            }
                          if (outer && outer->outer_context
                              && (outer->outer_context->region_type
-                                 == ORT_COMBINED_TEAMS))
+                                 & ORT_COMBINED_TEAMS) == ORT_COMBINED_TEAMS)
                            {
                              outer = outer->outer_context;
                              n = splay_tree_lookup (outer->variables,
@@ -10801,7 +10808,7 @@ gimplify_omp_for (tree *expr_p, gimple_s
                            }
                          if (outer && outer->outer_context
                              && (outer->outer_context->region_type
-                                 == ORT_COMBINED_TEAMS))
+                                 & ORT_COMBINED_TEAMS) == ORT_COMBINED_TEAMS)
                            {
                              outer = outer->outer_context;
                              n = splay_tree_lookup (outer->variables,
@@ -11497,6 +11504,12 @@ gimplify_omp_workshare (tree *expr_p, gi
       break;
     case OMP_TEAMS:
       ort = OMP_TEAMS_COMBINED (expr) ? ORT_COMBINED_TEAMS : ORT_TEAMS;
+      if (gimplify_omp_ctxp == NULL
+         || (gimplify_omp_ctxp->region_type == ORT_TARGET
+             && gimplify_omp_ctxp->outer_context == NULL
+             && lookup_attribute ("omp declare target",
+                                  DECL_ATTRIBUTES (current_function_decl))))
+       ort = (enum omp_region_type) (ort | ORT_HOST_TEAMS);
       break;
     case OACC_HOST_DATA:
       ort = ORT_ACC_HOST_DATA;
@@ -11508,7 +11521,8 @@ gimplify_omp_workshare (tree *expr_p, gi
                             TREE_CODE (expr));
   if (TREE_CODE (expr) == OMP_TARGET)
     optimize_target_teams (expr, pre_p);
-  if ((ort & (ORT_TARGET | ORT_TARGET_DATA)) != 0)
+  if ((ort & (ORT_TARGET | ORT_TARGET_DATA)) != 0
+      || (ort & ORT_HOST_TEAMS) == ORT_HOST_TEAMS)
     {
       push_gimplify_context ();
       gimple *g = gimplify_and_return_first (OMP_BODY (expr), &body);
@@ -11579,6 +11593,8 @@ gimplify_omp_workshare (tree *expr_p, gi
       break;
     case OMP_TEAMS:
       stmt = gimple_build_omp_teams (body, OMP_CLAUSES (expr));
+      if ((ort & ORT_HOST_TEAMS) == ORT_HOST_TEAMS)
+       gimple_omp_teams_set_host (as_a <gomp_teams *> (stmt), true);
       break;
     default:
       gcc_unreachable ();
--- gcc/omp-builtins.def.jj     2018-06-18 19:02:46.415895871 +0200
+++ gcc/omp-builtins.def        2018-07-23 18:50:35.091799636 +0200
@@ -365,5 +365,7 @@ DEF_GOMP_BUILTIN (BUILT_IN_GOMP_TARGET_E
                  BT_FN_VOID_INT_SIZE_PTR_PTR_PTR_UINT_PTR, ATTR_NOTHROW_LIST)
 DEF_GOMP_BUILTIN (BUILT_IN_GOMP_TEAMS, "GOMP_teams",
                  BT_FN_VOID_UINT_UINT, ATTR_NOTHROW_LIST)
+DEF_GOMP_BUILTIN (BUILT_IN_GOMP_TEAMS_REG, "GOMP_teams_reg",
+                 BT_FN_VOID_OMPFN_PTR_UINT_UINT_UINT, ATTR_NOTHROW_LIST)
 DEF_GOACC_BUILTIN (BUILT_IN_GOACC_DECLARE, "GOACC_declare",
                   BT_FN_VOID_INT_SIZE_PTR_PTR_PTR, ATTR_NOTHROW_LIST)
--- gcc/omp-low.c.jj    2018-07-17 17:24:39.974318595 +0200
+++ gcc/omp-low.c       2018-07-24 18:11:57.201917314 +0200
@@ -280,12 +280,23 @@ is_taskloop_ctx (omp_context *ctx)
 }
 
 
-/* Return true if CTX is for an omp parallel or omp task.  */
+/* Return true if CTX is for a host omp teams.  */
+
+static inline bool
+is_host_teams_ctx (omp_context *ctx)
+{
+  return gimple_code (ctx->stmt) == GIMPLE_OMP_TEAMS
+        && gimple_omp_teams_host (as_a <gomp_teams *> (ctx->stmt));
+}
+
+/* Return true if CTX is for an omp parallel or omp task or host omp teams
+   (the last one is strictly not a task region in OpenMP speak, but we
+   need to treat it similarly).  */
 
 static inline bool
 is_taskreg_ctx (omp_context *ctx)
 {
-  return is_parallel_ctx (ctx) || is_task_ctx (ctx);
+  return is_parallel_ctx (ctx) || is_task_ctx (ctx) || is_host_teams_ctx (ctx);
 }
 
 /* Return true if EXPR is variable sized.  */
@@ -1011,8 +1022,10 @@ scan_sharing_clauses (tree clauses, omp_
 
        case OMP_CLAUSE_SHARED:
          decl = OMP_CLAUSE_DECL (c);
-         /* Ignore shared directives in teams construct.  */
-         if (gimple_code (ctx->stmt) == GIMPLE_OMP_TEAMS)
+         /* Ignore shared directives in teams construct inside of
+            target construct.  */
+         if (gimple_code (ctx->stmt) == GIMPLE_OMP_TEAMS
+             && !is_host_teams_ctx (ctx))
            {
              /* Global variables don't need to be copied,
                 the receiver side will use them directly.  */
@@ -1393,8 +1406,10 @@ scan_sharing_clauses (tree clauses, omp_
          break;
 
        case OMP_CLAUSE_SHARED:
-         /* Ignore shared directives in teams construct.  */
-         if (gimple_code (ctx->stmt) == GIMPLE_OMP_TEAMS)
+         /* Ignore shared directives in teams construct inside of
+            target construct.  */
+         if (gimple_code (ctx->stmt) == GIMPLE_OMP_TEAMS
+             && !is_host_teams_ctx (ctx))
            break;
          decl = OMP_CLAUSE_DECL (c);
          if (is_global_var (maybe_lookup_decl_in_outer_ctx (decl, ctx)))
@@ -1907,7 +1922,7 @@ finish_taskreg_scan (omp_context *ctx)
     return;
 
   /* If any task_shared_vars were needed, verify all
-     OMP_CLAUSE_SHARED clauses on GIMPLE_OMP_{PARALLEL,TASK}
+     OMP_CLAUSE_SHARED clauses on GIMPLE_OMP_{PARALLEL,TASK,TEAMS}
      statements if use_pointer_for_field hasn't changed
      because of that.  If it did, update field types now.  */
   if (task_shared_vars)
@@ -1951,7 +1966,8 @@ finish_taskreg_scan (omp_context *ctx)
          }
     }
 
-  if (gimple_code (ctx->stmt) == GIMPLE_OMP_PARALLEL)
+  if (gimple_code (ctx->stmt) == GIMPLE_OMP_PARALLEL
+      || gimple_code (ctx->stmt) == GIMPLE_OMP_TEAMS)
     {
       layout_type (ctx->record_type);
       fixup_child_record_type (ctx);
@@ -2331,8 +2347,32 @@ static void
 scan_omp_teams (gomp_teams *stmt, omp_context *outer_ctx)
 {
   omp_context *ctx = new_omp_context (stmt, outer_ctx);
+
+  if (!gimple_omp_teams_host (stmt))
+    {
+      scan_sharing_clauses (gimple_omp_teams_clauses (stmt), ctx);
+      scan_omp (gimple_omp_body_ptr (stmt), ctx);
+      return;
+    }
+  taskreg_contexts.safe_push (ctx);
+  gcc_assert (taskreg_nesting_level == 1);
+  ctx->field_map = splay_tree_new (splay_tree_compare_pointers, 0, 0);
+  ctx->record_type = lang_hooks.types.make_type (RECORD_TYPE);
+  tree name = create_tmp_var_name (".omp_data_s");
+  name = build_decl (gimple_location (stmt),
+                    TYPE_DECL, name, ctx->record_type);
+  DECL_ARTIFICIAL (name) = 1;
+  DECL_NAMELESS (name) = 1;
+  TYPE_NAME (ctx->record_type) = name;
+  TYPE_ARTIFICIAL (ctx->record_type) = 1;
+  create_omp_child_function (ctx, false);
+  gimple_omp_teams_set_child_fn (stmt, ctx->cb.dst_fn);
+
   scan_sharing_clauses (gimple_omp_teams_clauses (stmt), ctx);
   scan_omp (gimple_omp_body_ptr (stmt), ctx);
+
+  if (TYPE_FIELDS (ctx->record_type) == NULL)
+    ctx->record_type = ctx->receiver_decl = NULL;
 }
 
 /* Check nesting restrictions.  */
@@ -2817,13 +2857,20 @@ check_omp_nesting_restrictions (gimple *
       }
       break;
     case GIMPLE_OMP_TEAMS:
-      if (ctx == NULL
-         || gimple_code (ctx->stmt) != GIMPLE_OMP_TARGET
-         || gimple_omp_target_kind (ctx->stmt) != GF_OMP_TARGET_KIND_REGION)
-       {
+      if (ctx == NULL)
+       break;
+      else if (gimple_code (ctx->stmt) != GIMPLE_OMP_TARGET
+              || (gimple_omp_target_kind (ctx->stmt)
+                  != GF_OMP_TARGET_KIND_REGION))
+       {
+         /* Teams construct can appear either strictly nested inside of
+            target construct with no intervening stmts, or can be encountered
+            only by initial task (so must not appear inside any OpenMP
+            construct).  */
          error_at (gimple_location (stmt),
-                   "%<teams%> construct not closely nested inside of "
-                   "%<target%> construct");
+                   "%<teams%> construct must be closely nested inside of "
+                   "%<target%> construct or not nested in any OpenMP "
+                   "construct");
          return false;
        }
       break;
@@ -3107,7 +3154,14 @@ scan_omp_1_stmt (gimple_stmt_iterator *g
       break;
 
     case GIMPLE_OMP_TEAMS:
-      scan_omp_teams (as_a <gomp_teams *> (stmt), ctx);
+      if (gimple_omp_teams_host (as_a <gomp_teams *> (stmt)))
+       {
+         taskreg_nesting_level++;
+         scan_omp_teams (as_a <gomp_teams *> (stmt), ctx);
+         taskreg_nesting_level--;
+       }
+      else
+       scan_omp_teams (as_a <gomp_teams *> (stmt), ctx);
       break;
 
     case GIMPLE_BIND:
@@ -3595,8 +3649,10 @@ lower_rec_input_clauses (tree clauses, g
                continue;
              break;
            case OMP_CLAUSE_SHARED:
-             /* Ignore shared directives in teams construct.  */
-             if (gimple_code (ctx->stmt) == GIMPLE_OMP_TEAMS)
+             /* Ignore shared directives in teams construct inside
+                of target construct.  */
+             if (gimple_code (ctx->stmt) == GIMPLE_OMP_TEAMS
+                 && !is_host_teams_ctx (ctx))
                continue;
              if (maybe_lookup_decl (OMP_CLAUSE_DECL (c), ctx) == NULL)
                {
@@ -4077,8 +4133,10 @@ lower_rec_input_clauses (tree clauses, g
          switch (OMP_CLAUSE_CODE (c))
            {
            case OMP_CLAUSE_SHARED:
-             /* Ignore shared directives in teams construct.  */
-             if (gimple_code (ctx->stmt) == GIMPLE_OMP_TEAMS)
+             /* Ignore shared directives in teams construct inside
+                target construct.  */
+             if (gimple_code (ctx->stmt) == GIMPLE_OMP_TEAMS
+                 && !is_host_teams_ctx (ctx))
                continue;
              /* Shared global vars are just accessed directly.  */
              if (is_global_var (new_var))
@@ -8927,7 +8985,10 @@ lower_omp_1 (gimple_stmt_iterator *gsi_p
     case GIMPLE_OMP_TEAMS:
       ctx = maybe_lookup_ctx (stmt);
       gcc_assert (ctx);
-      lower_omp_teams (gsi_p, ctx);
+      if (gimple_omp_teams_host (as_a <gomp_teams *> (stmt)))
+       lower_omp_taskreg (gsi_p, ctx);
+      else
+       lower_omp_teams (gsi_p, ctx);
       break;
     case GIMPLE_OMP_GRID_BODY:
       ctx = maybe_lookup_ctx (stmt);
--- gcc/omp-expand.c.jj 2018-06-18 19:07:09.152186493 +0200
+++ gcc/omp-expand.c    2018-07-23 18:57:45.465491549 +0200
@@ -534,7 +534,7 @@ adjust_context_and_scope (tree entry_blo
     }
 }
 
-/* Build the function calls to GOMP_parallel_start etc to actually
+/* Build the function calls to GOMP_parallel etc to actually
    generate the parallel operation.  REGION is the parallel region
    being expanded.  BB is the block where to insert the code.  WS_ARGS
    will be set if this is a call to a combined parallel+workshare
@@ -889,6 +889,59 @@ expand_taskwait_call (basic_block bb, go
                            false, GSI_CONTINUE_LINKING);
 }
 
+/* Build the function call to GOMP_teams_reg to actually
+   generate the host teams operation.  REGION is the teams region
+   being expanded.  BB is the block where to insert the code.  */
+
+static void
+expand_teams_call (basic_block bb, gomp_teams *entry_stmt)
+{
+  tree clauses = gimple_omp_teams_clauses (entry_stmt);
+  tree num_teams = omp_find_clause (clauses, OMP_CLAUSE_NUM_TEAMS);
+  if (num_teams == NULL_TREE)
+    num_teams = build_int_cst (unsigned_type_node, 0);
+  else
+    {
+      num_teams = OMP_CLAUSE_NUM_TEAMS_EXPR (num_teams);
+      num_teams = fold_convert (unsigned_type_node, num_teams);
+    }
+  tree thread_limit = omp_find_clause (clauses, OMP_CLAUSE_THREAD_LIMIT);
+  if (thread_limit == NULL_TREE)
+    thread_limit = build_int_cst (unsigned_type_node, 0);
+  else
+    {
+      thread_limit = OMP_CLAUSE_THREAD_LIMIT_EXPR (thread_limit);
+      thread_limit = fold_convert (unsigned_type_node, thread_limit);
+    }
+
+  gimple_stmt_iterator gsi = gsi_last_nondebug_bb (bb);
+  tree t = gimple_omp_teams_data_arg (entry_stmt), t1;
+  if (t == NULL)
+    t1 = null_pointer_node;
+  else
+    t1 = build_fold_addr_expr (t);
+  tree child_fndecl = gimple_omp_teams_child_fn (entry_stmt);
+  tree t2 = build_fold_addr_expr (child_fndecl);
+
+  adjust_context_and_scope (gimple_block (entry_stmt), child_fndecl);
+
+  vec<tree, va_gc> *args;
+  vec_alloc (args, 5);
+  args->quick_push (t2);
+  args->quick_push (t1);
+  args->quick_push (num_teams);
+  args->quick_push (thread_limit);
+  /* For future extensibility.  */
+  args->quick_push (build_zero_cst (unsigned_type_node));
+
+  t = build_call_expr_loc_vec (UNKNOWN_LOCATION,
+                              builtin_decl_explicit (BUILT_IN_GOMP_TEAMS_REG),
+                              args);
+
+  force_gimple_operand_gsi (&gsi, t, true, NULL_TREE,
+                           false, GSI_CONTINUE_LINKING);
+}
+
 /* Chain all the DECLs in LIST by their TREE_CHAIN fields.  */
 
 static tree
@@ -1171,7 +1224,8 @@ expand_omp_taskreg (struct omp_region *r
 
       gsi = gsi_last_nondebug_bb (entry_bb);
       gcc_assert (gimple_code (gsi_stmt (gsi)) == GIMPLE_OMP_PARALLEL
-                 || gimple_code (gsi_stmt (gsi)) == GIMPLE_OMP_TASK);
+                 || gimple_code (gsi_stmt (gsi)) == GIMPLE_OMP_TASK
+                 || gimple_code (gsi_stmt (gsi)) == GIMPLE_OMP_TEAMS);
       gsi_remove (&gsi, true);
 
       new_bb = entry_bb;
@@ -1224,8 +1278,8 @@ expand_omp_taskreg (struct omp_region *r
                     effectively doing a STRIP_NOPS.  */
 
                  if (TREE_CODE (arg) == ADDR_EXPR
-                     && TREE_OPERAND (arg, 0)
-                       == gimple_omp_taskreg_data_arg (entry_stmt))
+                     && (TREE_OPERAND (arg, 0)
+                         == gimple_omp_taskreg_data_arg (entry_stmt)))
                    {
                      parcopy_stmt = stmt;
                      break;
@@ -1285,12 +1339,13 @@ expand_omp_taskreg (struct omp_region *r
       gsi = gsi_last_nondebug_bb (entry_bb);
       stmt = gsi_stmt (gsi);
       gcc_assert (stmt && (gimple_code (stmt) == GIMPLE_OMP_PARALLEL
-                          || gimple_code (stmt) == GIMPLE_OMP_TASK));
+                          || gimple_code (stmt) == GIMPLE_OMP_TASK
+                          || gimple_code (stmt) == GIMPLE_OMP_TEAMS));
       e = split_block (entry_bb, stmt);
       gsi_remove (&gsi, true);
       entry_bb = e->dest;
       edge e2 = NULL;
-      if (gimple_code (entry_stmt) == GIMPLE_OMP_PARALLEL)
+      if (gimple_code (entry_stmt) != GIMPLE_OMP_TASK)
        single_succ_edge (entry_bb)->flags = EDGE_FALLTHRU;
       else
        {
@@ -1416,6 +1471,8 @@ expand_omp_taskreg (struct omp_region *r
   if (gimple_code (entry_stmt) == GIMPLE_OMP_PARALLEL)
     expand_parallel_call (region, new_bb,
                          as_a <gomp_parallel *> (entry_stmt), ws_args);
+  else if (gimple_code (entry_stmt) == GIMPLE_OMP_TEAMS)
+    expand_teams_call (new_bb, as_a <gomp_teams *> (entry_stmt));
   else
     expand_task_call (region, new_bb, as_a <gomp_task *> (entry_stmt));
   if (gimple_in_ssa_p (cfun))
@@ -6038,6 +6095,12 @@ expand_omp_synch (struct omp_region *reg
              || gimple_code (gsi_stmt (si)) == GIMPLE_OMP_ORDERED
              || gimple_code (gsi_stmt (si)) == GIMPLE_OMP_CRITICAL
              || gimple_code (gsi_stmt (si)) == GIMPLE_OMP_TEAMS);
+  if (gimple_code (gsi_stmt (si)) == GIMPLE_OMP_TEAMS
+      && gimple_omp_teams_host (as_a <gomp_teams *> (gsi_stmt (si))))
+    {
+      expand_omp_taskreg (region);
+      return;
+    }
   gsi_remove (&si, true);
   single_succ_edge (entry_bb)->flags = EDGE_FALLTHRU;
 
--- gcc/c/c-parser.c.jj 2018-07-10 11:39:48.748566577 +0200
+++ gcc/c/c-parser.c    2018-07-24 18:19:26.526677938 +0200
@@ -17418,7 +17418,7 @@ c_parser_omp_teams (location_t loc, c_pa
          if (!flag_openmp)  /* flag_openmp_simd  */
            return c_parser_omp_distribute (loc, parser, p_name, mask,
                                            cclauses, if_p);
-         block = c_begin_compound_stmt (true);
+         block = c_begin_omp_parallel ();
          ret = c_parser_omp_distribute (loc, parser, p_name, mask, cclauses,
                                         if_p);
          block = c_end_compound_stmt (loc, block, true);
@@ -17430,6 +17430,7 @@ c_parser_omp_teams (location_t loc, c_pa
          OMP_TEAMS_CLAUSES (ret) = clauses;
          OMP_TEAMS_BODY (ret) = block;
          OMP_TEAMS_COMBINED (ret) = 1;
+         SET_EXPR_LOCATION (ret, loc);
          return add_stmt (ret);
        }
     }
@@ -17449,7 +17450,10 @@ c_parser_omp_teams (location_t loc, c_pa
   tree stmt = make_node (OMP_TEAMS);
   TREE_TYPE (stmt) = void_type_node;
   OMP_TEAMS_CLAUSES (stmt) = clauses;
-  OMP_TEAMS_BODY (stmt) = c_parser_omp_structured_block (parser, if_p);
+  block = c_begin_omp_parallel ();
+  add_stmt (c_parser_omp_structured_block (parser, if_p));
+  OMP_TEAMS_BODY (stmt) = c_end_compound_stmt (loc, block, true);
+  SET_EXPR_LOCATION (stmt, loc);
 
   return add_stmt (stmt);
 }
@@ -17870,6 +17874,7 @@ c_parser_omp_target (c_parser *parser, e
          OMP_TARGET_CLAUSES (stmt) = cclauses[C_OMP_CLAUSE_SPLIT_TARGET];
          OMP_TARGET_BODY (stmt) = block;
          OMP_TARGET_COMBINED (stmt) = 1;
+         SET_EXPR_LOCATION (stmt, loc);
          add_stmt (stmt);
          pc = &OMP_TARGET_CLAUSES (stmt);
          goto check_clauses;
--- gcc/cp/cp-tree.h.jj 2018-07-17 17:24:39.968318586 +0200
+++ gcc/cp/cp-tree.h    2018-07-25 11:43:23.689557550 +0200
@@ -6976,9 +6976,10 @@ extern tree finish_omp_for (location_t
                                                 tree, tree, tree, tree, tree,
                                                 tree, tree, vec<tree> *, tree);
 extern tree finish_omp_for_block               (tree, tree);
-extern void finish_omp_atomic                  (enum tree_code, enum tree_code,
-                                                tree, tree, tree, tree, tree,
-                                                tree, enum omp_memory_order);
+extern void finish_omp_atomic                  (location_t, enum tree_code,
+                                                enum tree_code, tree, tree,
+                                                tree, tree, tree, tree,
+                                                enum omp_memory_order);
 extern void finish_omp_barrier                 (void);
 extern void finish_omp_depobj                  (location_t, tree,
                                                 enum omp_clause_depend_kind,
--- gcc/cp/parser.c.jj  2018-07-18 12:34:50.454933533 +0200
+++ gcc/cp/parser.c     2018-07-25 11:42:54.123510698 +0200
@@ -35211,8 +35211,8 @@ stmt_done:
     }
 done:
   clauses = finish_omp_clauses (clauses, C_ORT_OMP);
-  finish_omp_atomic (code, opcode, lhs, rhs, v, lhs1, rhs1, clauses,
-                    memory_order);
+  finish_omp_atomic (pragma_tok->location, code, opcode, lhs, rhs, v, lhs1,
+                    rhs1, clauses, memory_order);
   if (!structured_block)
     cp_parser_consume_semicolon_at_end_of_statement (parser);
   return;
@@ -36859,6 +36859,7 @@ cp_parser_omp_single (cp_parser *parser,
 {
   tree stmt = make_node (OMP_SINGLE);
   TREE_TYPE (stmt) = void_type_node;
+  SET_EXPR_LOCATION (stmt, pragma_tok->location);
 
   OMP_SINGLE_CLAUSES (stmt)
     = cp_parser_omp_all_clauses (parser, OMP_SINGLE_CLAUSE_MASK,
@@ -37182,6 +37183,7 @@ cp_parser_omp_teams (cp_parser *parser,
          if (!flag_openmp)  /* flag_openmp_simd  */
            return cp_parser_omp_distribute (parser, pragma_tok, p_name, mask,
                                             cclauses, if_p);
+         keep_next_level (true);
          sb = begin_omp_structured_block ();
          save = cp_parser_begin_omp_structured_block (parser);
          ret = cp_parser_omp_distribute (parser, pragma_tok, p_name, mask,
@@ -37217,6 +37219,7 @@ cp_parser_omp_teams (cp_parser *parser,
   tree stmt = make_node (OMP_TEAMS);
   TREE_TYPE (stmt) = void_type_node;
   OMP_TEAMS_CLAUSES (stmt) = clauses;
+  keep_next_level (true);
   OMP_TEAMS_BODY (stmt) = cp_parser_omp_structured_block (parser, if_p);
   SET_EXPR_LOCATION (stmt, loc);
 
--- gcc/cp/semantics.c.jj       2018-07-17 17:24:39.969318588 +0200
+++ gcc/cp/semantics.c  2018-07-25 11:41:47.760405527 +0200
@@ -8748,9 +8748,9 @@ finish_omp_for_block (tree bind, tree om
 }
 
 void
-finish_omp_atomic (enum tree_code code, enum tree_code opcode, tree lhs,
-                  tree rhs, tree v, tree lhs1, tree rhs1, tree clauses,
-                  enum omp_memory_order mo)
+finish_omp_atomic (location_t loc, enum tree_code code, enum tree_code opcode,
+                  tree lhs, tree rhs, tree v, tree lhs1, tree rhs1,
+                  tree clauses, enum omp_memory_order mo)
 {
   tree orig_lhs;
   tree orig_rhs;
@@ -8827,7 +8827,7 @@ finish_omp_atomic (enum tree_code code,
                   "expressions for memory");
          return;
        }
-      stmt = c_finish_omp_atomic (input_location, code, opcode, lhs, rhs,
+      stmt = c_finish_omp_atomic (loc, code, opcode, lhs, rhs,
                                  v, lhs1, rhs1, swapped, mo,
                                  processing_template_decl != 0);
       if (stmt == error_mark_node)
@@ -8837,8 +8837,7 @@ finish_omp_atomic (enum tree_code code,
     {
       if (code == OMP_ATOMIC_READ)
        {
-         stmt = build_min_nt_loc (EXPR_LOCATION (orig_lhs),
-                                  OMP_ATOMIC_READ, orig_lhs);
+         stmt = build_min_nt_loc (loc, OMP_ATOMIC_READ, orig_lhs);
          OMP_ATOMIC_MEMORY_ORDER (stmt) = mo;
          stmt = build2 (MODIFY_EXPR, void_type_node, orig_v, stmt);
        }
@@ -8853,8 +8852,7 @@ finish_omp_atomic (enum tree_code code,
                                     COMPOUND_EXPR, orig_rhs1, stmt);
          if (code != OMP_ATOMIC)
            {
-             stmt = build_min_nt_loc (EXPR_LOCATION (orig_lhs1),
-                                      code, orig_lhs1, stmt);
+             stmt = build_min_nt_loc (loc, code, orig_lhs1, stmt);
              OMP_ATOMIC_MEMORY_ORDER (stmt) = mo;
              stmt = build2 (MODIFY_EXPR, void_type_node, orig_v, stmt);
            }
@@ -8862,6 +8860,7 @@ finish_omp_atomic (enum tree_code code,
       stmt = build2 (OMP_ATOMIC, void_type_node,
                     clauses ? clauses : integer_zero_node, stmt);
       OMP_ATOMIC_MEMORY_ORDER (stmt) = mo;
+      SET_EXPR_LOCATION (stmt, loc);
     }
   finish_expr_stmt (stmt);
 }
--- gcc/cp/pt.c.jj      2018-07-18 12:34:50.459933538 +0200
+++ gcc/cp/pt.c 2018-07-25 11:44:17.162642291 +0200
@@ -17334,9 +17334,19 @@ tsubst_expr (tree t, tree args, tsubst_f
                                          && OMP_TEAMS_COMBINED (t));
       tmp = tsubst_omp_clauses (OMP_CLAUSES (t), C_ORT_OMP, args, complain,
                                in_decl);
-      stmt = push_stmt_list ();
-      RECUR (OMP_BODY (t));
-      stmt = pop_stmt_list (stmt);
+      if (TREE_CODE (t) == OMP_TEAMS)
+       {
+         keep_next_level (true);
+         stmt = begin_omp_structured_block ();
+         RECUR (OMP_BODY (t));
+         stmt = finish_omp_structured_block (stmt);
+       }
+      else
+       {
+         stmt = push_stmt_list ();
+         RECUR (OMP_BODY (t));
+         stmt = pop_stmt_list (stmt);
+       }
 
       t = copy_node (t);
       OMP_BODY (t) = stmt;
@@ -17489,8 +17499,8 @@ tsubst_expr (tree t, tree args, tsubst_f
            }
          lhs = RECUR (TREE_OPERAND (op1, 0));
          rhs = RECUR (TREE_OPERAND (op1, 1));
-         finish_omp_atomic (OMP_ATOMIC, TREE_CODE (op1), lhs, rhs,
-                            NULL_TREE, NULL_TREE, rhs1, tmp,
+         finish_omp_atomic (EXPR_LOCATION (t), OMP_ATOMIC, TREE_CODE (op1),
+                            lhs, rhs, NULL_TREE, NULL_TREE, rhs1, tmp,
                             OMP_ATOMIC_MEMORY_ORDER (t));
        }
       else
@@ -17528,8 +17538,8 @@ tsubst_expr (tree t, tree args, tsubst_f
              lhs = RECUR (TREE_OPERAND (op1, 0));
              rhs = RECUR (TREE_OPERAND (op1, 1));
            }
-         finish_omp_atomic (code, opcode, lhs, rhs, v, lhs1, rhs1, tmp,
-                            OMP_ATOMIC_MEMORY_ORDER (t));
+         finish_omp_atomic (EXPR_LOCATION (t), code, opcode, lhs, rhs, v,
+                            lhs1, rhs1, tmp, OMP_ATOMIC_MEMORY_ORDER (t));
        }
       break;
 
--- gcc/testsuite/c-c++-common/gomp/teams-1.c.jj        2018-07-24 16:30:47.794593236 +0200
+++ gcc/testsuite/c-c++-common/gomp/teams-1.c   2018-07-24 18:49:20.149857294 +0200
@@ -0,0 +1,64 @@
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+int omp_get_num_teams (void);
+int omp_get_team_num (void);
+
+#ifdef __cplusplus
+}
+#endif
+
+void bar (int *, int *, int *, int, int, int, int);
+
+void
+foo (void)
+{
+  int a = 1, b = 2, c = 3, d = 4, e = 5, f = 6;
+  #pragma omp teams num_teams (4) shared (b) firstprivate (c, d) private (e, f)
+  {
+    f = 7;
+    bar (&a, &c, &e, b, d, f, 0);
+  }
+  bar (&a, (int *) 0, (int *) 0, b, 0, 0, 1);
+}
+
+void
+baz (void)
+{
+  #pragma omp teams
+  {
+    #pragma omp distribute
+    for (int i = 0; i < 64; i++)
+      ;
+    #pragma omp distribute simd
+    for (int i = 0; i < 64; i++)
+      ;
+    #pragma omp distribute parallel for
+    for (int i = 0; i < 64; i++)
+      ;
+    #pragma omp distribute parallel for
+    for (int i = 0; i < 64; i++)
+      ;
+    #pragma omp distribute parallel for simd
+    for (int i = 0; i < 64; i++)
+      ;
+    #pragma omp parallel
+    ;
+    #pragma omp parallel for
+    for (int i = 0; i < 64; i++)
+      ;
+    #pragma omp parallel for simd
+    for (int i = 0; i < 64; i++)
+      ;
+    int a, b;
+    #pragma omp parallel sections
+    {
+      a = 5;
+      #pragma omp section
+      b = 6;
+    }
+    int c = omp_get_num_teams ();
+    int d = omp_get_team_num ();
+  }
+}
--- gcc/testsuite/c-c++-common/gomp/teams-2.c.jj        2018-07-24 16:46:13.124082656 +0200
+++ gcc/testsuite/c-c++-common/gomp/teams-2.c   2018-07-25 11:21:36.586131565 +0200
@@ -0,0 +1,119 @@
+void
+foo (void)
+{
+  int i;
+
+  #pragma omp parallel
+  {
+    #pragma omp teams  /* { dg-error "'teams' construct must be closely nested inside of 'target' construct or not nested in any OpenMP construct" } */
+    ;
+  }
+  #pragma omp teams
+  {
+    #pragma omp teams  /* { dg-error "only 'distribute' or 'parallel' regions are allowed to be strictly nested inside 'teams' region" } */
+    ;
+  }
+  #pragma omp target
+  {
+    #pragma omp parallel
+    {
+      #pragma omp teams        /* { dg-error "'teams' construct must be closely nested inside of 'target' construct or not nested in any OpenMP construct" } */
+      ;
+    }
+  }
+  #pragma omp for
+  for (i = 0; i < 4; i++)
+    if (i == 0)
+      {
+       #pragma omp teams       /* { dg-error "'teams' construct must be closely nested inside of 'target' construct or not nested in any OpenMP construct" } */
+       ;
+      }
+  #pragma omp single
+  #pragma omp teams    /* { dg-error "'teams' construct must be closely nested inside of 'target' construct or not nested in any OpenMP construct" } */
+  ;
+  #pragma omp master
+  {
+    #pragma omp teams  /* { dg-error "'teams' construct must be closely nested inside of 'target' construct or not nested in any OpenMP construct" } */
+    ;
+  }
+  #pragma omp critical
+  #pragma omp teams    /* { dg-error "'teams' construct must be closely nested inside of 'target' construct or not nested in any OpenMP construct" } */
+  ;
+  #pragma omp sections
+  {
+    #pragma omp teams  /* { dg-error "'teams' construct must be closely nested inside of 'target' construct or not nested in any OpenMP construct" } */
+    ;
+    #pragma omp section
+    {
+      #pragma omp teams        /* { dg-error "'teams' construct must be closely nested inside of 'target' construct or not nested in any OpenMP construct" } */
+      ;
+    }
+  }
+  #pragma omp target data map (to: i)
+  {
+    #pragma omp teams  /* { dg-error "'teams' construct must be closely nested inside of 'target' construct or not nested in any OpenMP construct" } */
+    ;
+  }
+  #pragma omp task
+  {
+    #pragma omp teams  /* { dg-error "'teams' construct must be closely nested inside of 'target' construct or not nested in any OpenMP construct" } */
+    ;
+  }
+  #pragma omp taskgroup
+  {
+    #pragma omp teams  /* { dg-error "'teams' construct must be closely nested inside of 'target' construct or not nested in any OpenMP construct" } */
+    ;
+  }
+}
+
+void
+bar (void)
+{
+  #pragma omp teams
+  {
+    int x, y, v = 4;
+    #pragma omp target                 /* { dg-error "only 'distribute' or 'parallel' regions are allowed to be strictly nested inside 'teams' region" } */
+    ;
+    #pragma omp target data map (to: v)        /* { dg-error "only 'distribute' or 'parallel' regions are allowed to be strictly nested inside 'teams' region" } */
+    ;
+    #pragma omp for                    /* { dg-error "only 'distribute' or 'parallel' regions are allowed to be strictly nested inside 'teams' region" } */
+    for (int i = 0; i < 64; ++i)
+      ;
+    #pragma omp simd                   /* { dg-error "only 'distribute' or 'parallel' regions are allowed to be strictly nested inside 'teams' region" } */
+    for (int i = 0; i < 64; ++i)
+      ;
+    #pragma omp for simd               /* { dg-error "only 'distribute' or 'parallel' regions are allowed to be strictly nested inside 'teams' region" } */
+    for (int i = 0; i < 64; ++i)
+      ;
+    #pragma omp single                 /* { dg-error "only 'distribute' or 'parallel' regions are allowed to be strictly nested inside 'teams' region" } */
+    ;
+    #pragma omp master                 /* { dg-error "only 'distribute' or 'parallel' regions are allowed to be strictly nested inside 'teams' region" } */
+    ;
+    #pragma omp sections               /* { dg-error "only 'distribute' or 'parallel' regions are allowed to be strictly nested inside 'teams' region" } */
+    {
+      x = 1;
+      #pragma omp section
+      y = 2;
+    }
+    #pragma omp critical               /* { dg-error "only 'distribute' or 'parallel' regions are allowed to be strictly nested inside 'teams' region" } */
+    ;
+    #pragma omp target enter data map (to: v)  /* { dg-error "only 'distribute' or 'parallel' regions are allowed to be strictly nested inside 'teams' region" } */
+    #pragma omp target exit data map (from: v) /* { dg-error "only 'distribute' or 'parallel' regions are allowed to be strictly nested inside 'teams' region" } */
+    #pragma omp cancel parallel                /* { dg-error "only 'distribute' or 'parallel' regions are allowed to be strictly nested inside 'teams' region" } */
+    #pragma omp cancellation point parallel /* { dg-error "only 'distribute' or 'parallel' regions are allowed to be strictly nested inside 'teams' region" } */
+    #pragma omp barrier                        /* { dg-error "only 'distribute' or 'parallel' regions are allowed to be strictly nested inside 'teams' region" } */
+    #pragma omp ordered                        /* { dg-error "only 'distribute' or 'parallel' regions are allowed to be strictly nested inside 'teams' region" } */
+    ;
+    #pragma omp task                   /* { dg-error "only 'distribute' or 'parallel' regions are allowed to be strictly nested inside 'teams' region" } */
+    ;
+    #pragma omp taskloop               /* { dg-error "only 'distribute' or 'parallel' regions are allowed to be strictly nested inside 'teams' region" } */
+    for (int i = 0; i < 64; ++i)
+      ;
+    #pragma omp atomic                 /* { dg-error "only 'distribute' or 'parallel' regions are allowed to be strictly nested inside 'teams' region" } */
+    v++;
+    #pragma omp taskgroup              /* { dg-error "only 'distribute' or 'parallel' regions are allowed to be strictly nested inside 'teams' region" } */
+    ;
+    #pragma omp taskwait               /* { dg-error "only 'distribute' or 'parallel' regions are allowed to be strictly nested inside 'teams' region" } */
+    #pragma omp taskyield              /* { dg-error "only 'distribute' or 'parallel' regions are allowed to be strictly nested inside 'teams' region" } */
+  }
+}
--- gcc/testsuite/g++.dg/gomp/tpl-atomic-2.C.jj 2017-05-04 15:05:46.028701260 +0200
+++ gcc/testsuite/g++.dg/gomp/tpl-atomic-2.C    2018-07-25 11:58:54.329329166 +0200
@@ -6,21 +6,21 @@ struct S { int x; } s;
 // even when the templates are never instantiated.
 template<typename T> void f1()
 {
-  #pragma omp atomic
-  s += 1;              // { dg-error "invalid" }
+  #pragma omp atomic   // { dg-error "invalid" }
+  s += 1;
 }
 
 template<typename T> void f2(float *f)
 {
-  #pragma omp atomic
-  *f |= 1;             // { dg-error "invalid|evaluation" }
+  #pragma omp atomic   // { dg-error "invalid" }
+  *f |= 1;             // { dg-error "evaluation" }
 }
 
 // Here the rhs is dependent, but not type dependent.
 template<typename T> void f3(float *f)
 {
-  #pragma omp atomic
-  *f |= sizeof (T);    // { dg-error "invalid|evaluation" }
+  #pragma omp atomic   // { dg-error "invalid" }
+  *f |= sizeof (T);    // { dg-error "evaluation" }
 }
 
 // And the converse, no error here because we're never fed a T.
@@ -35,6 +35,6 @@ template<typename T> void f4(T *t)
 // of the semantic analysis concurrent with that.
 template<typename T> void f5(float *f)
 {
-  #pragma omp atomic
-  *f |= (T)sizeof(T);  // { dg-error "invalid|evaluation" "" { xfail *-*-* } }
+  #pragma omp atomic   // { dg-error "invalid" "" { xfail *-*-* } }
+  *f |= (T)sizeof(T);  // { dg-error "evaluation" "" { xfail *-*-* } }
 }
--- gcc/testsuite/gcc.dg/gomp/teams-1.c.jj      2017-05-04 15:05:34.813844816 +0200
+++ gcc/testsuite/gcc.dg/gomp/teams-1.c 2018-07-24 18:30:08.809798941 +0200
@@ -23,8 +23,8 @@ foo (int x)
 
   switch (x) // { dg-error "invalid entry to OpenMP structured block" }
   {
-  #pragma omp target teams
-    { case 0:; } // { dg-warning "statement will never be executed" }
+  #pragma omp target teams // { dg-warning "statement will never be executed" }
+    { case 0:; }
   }
 }
 
--- libgomp/Makefile.am.jj      2018-05-21 17:21:48.717963247 +0200
+++ libgomp/Makefile.am 2018-07-25 15:42:14.319714084 +0200
@@ -64,7 +64,7 @@ libgomp_la_SOURCES = alloc.c atomic.c ba
        proc.c sem.c bar.c ptrlock.c time.c fortran.c affinity.c target.c \
        splay-tree.c libgomp-plugin.c oacc-parallel.c oacc-host.c oacc-init.c \
        oacc-mem.c oacc-async.c oacc-plugin.c oacc-cuda.c priority_queue.c \
-       affinity-fmt.c
+       affinity-fmt.c teams.c
 
 include $(top_srcdir)/plugin/Makefrag.am
 
--- libgomp/libgomp_g.h.jj      2018-06-18 19:02:46.477895938 +0200
+++ libgomp/libgomp_g.h 2018-07-25 15:23:32.075395047 +0200
@@ -294,6 +294,11 @@ extern void GOMP_target_enter_exit_data
                                         void **);
 extern void GOMP_teams (unsigned int, unsigned int);
 
+/* teams.c */
+
+extern void GOMP_teams_reg (void (*) (void *), void *, unsigned, unsigned,
+                           unsigned);
+
 /* oacc-parallel.c */
 
 extern void GOACC_parallel_keyed (int, void (*) (void *), size_t,
--- libgomp/libgomp.map.jj      2018-07-10 11:31:16.185088609 +0200
+++ libgomp/libgomp.map 2018-07-25 15:50:38.595714295 +0200
@@ -313,6 +313,7 @@ GOMP_4.5 {
 GOMP_5.0 {
   global:
        GOMP_taskwait_depend;
+       GOMP_teams_reg;
 } GOMP_4.5;
 
 OACC_2.0 {
--- libgomp/icv-device.c.jj     2018-04-30 13:21:06.082866155 +0200
+++ libgomp/icv-device.c        2018-07-25 15:34:34.058747525 +0200
@@ -49,20 +49,6 @@ omp_get_num_devices (void)
 }
 
 int
-omp_get_num_teams (void)
-{
-  /* Hardcoded to 1 on host, MIC, HSAIL?  Maybe variable on PTX.  */
-  return 1;
-}
-
-int
-omp_get_team_num (void)
-{
-  /* Hardcoded to 0 on host, MIC, HSAIL?  Maybe variable on PTX.  */
-  return 0;
-}
-
-int
 omp_is_initial_device (void)
 {
   /* Hardcoded to 1 on host, should be 0 on MIC, HSAIL, PTX.  */
@@ -72,6 +58,4 @@ omp_is_initial_device (void)
 ialias (omp_set_default_device)
 ialias (omp_get_default_device)
 ialias (omp_get_num_devices)
-ialias (omp_get_num_teams)
-ialias (omp_get_team_num)
 ialias (omp_is_initial_device)
--- libgomp/teams.c.jj  2018-07-25 15:15:36.185482035 +0200
+++ libgomp/teams.c     2018-07-25 16:20:04.969409319 +0200
@@ -0,0 +1,73 @@
+/* Copyright (C) 2018 Free Software Foundation, Inc.
+   Contributed by Jakub Jelinek <ja...@redhat.com>.
+
+   This file is part of the GNU Offloading and Multi Processing Library
+   (libgomp).
+
+   Libgomp is free software; you can redistribute it and/or modify it
+   under the terms of the GNU General Public License as published by
+   the Free Software Foundation; either version 3, or (at your option)
+   any later version.
+
+   Libgomp is distributed in the hope that it will be useful, but WITHOUT ANY
+   WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS
+   FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+   more details.
+
+   Under Section 7 of GPL version 3, you are granted additional
+   permissions described in the GCC Runtime Library Exception, version
+   3.1, as published by the Free Software Foundation.
+
+   You should have received a copy of the GNU General Public License and
+   a copy of the GCC Runtime Library Exception along with this program;
+   see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
+   <http://www.gnu.org/licenses/>.  */
+
+/* This file handles the host TEAMS construct.  */
+
+#include "libgomp.h"
+
+static unsigned gomp_num_teams = 1, gomp_team_num = 0;
+
+void
+GOMP_teams_reg (void (*fn) (void *), void *data, unsigned int num_teams,
+               unsigned int thread_limit, unsigned int flags)
+{
+  (void) flags;
+  (void) num_teams;
+  unsigned old_thread_limit_var = 0;
+  if (thread_limit)
+    {
+      struct gomp_task_icv *icv = gomp_icv (true);
+      old_thread_limit_var = icv->thread_limit_var;
+      icv->thread_limit_var
+       = thread_limit > INT_MAX ? UINT_MAX : thread_limit;
+    }
+  if (num_teams == 0)
+    num_teams = 3;
+  gomp_num_teams = num_teams;
+  for (gomp_team_num = 0; gomp_team_num < num_teams; gomp_team_num++)
+    fn (data);
+  gomp_num_teams = 1;
+  gomp_team_num = 0;
+  if (thread_limit)
+    {
+      struct gomp_task_icv *icv = gomp_icv (true);
+      icv->thread_limit_var = old_thread_limit_var;
+    }
+}
+
+int
+omp_get_num_teams (void)
+{
+  return gomp_num_teams;
+}
+
+int
+omp_get_team_num (void)
+{
+  return gomp_team_num;
+}
+
+ialias (omp_get_num_teams)
+ialias (omp_get_team_num)
--- libgomp/config/nvptx/icv-device.c.jj        2018-04-30 13:20:55.225861793 +0200
+++ libgomp/config/nvptx/icv-device.c   2018-07-25 15:36:44.522027027 +0200
@@ -46,20 +46,6 @@ omp_get_num_devices (void)
 }
 
 int
-omp_get_num_teams (void)
-{
-  return gomp_num_teams_var + 1;
-}
-
-int
-omp_get_team_num (void)
-{
-  int ctaid;
-  asm ("mov.u32 %0, %%ctaid.x;" : "=r" (ctaid));
-  return ctaid;
-}
-
-int
 omp_is_initial_device (void)
 {
   /* NVPTX is an accelerator-only target.  */
@@ -69,6 +55,4 @@ omp_is_initial_device (void)
 ialias (omp_set_default_device)
 ialias (omp_get_default_device)
 ialias (omp_get_num_devices)
-ialias (omp_get_num_teams)
-ialias (omp_get_team_num)
 ialias (omp_is_initial_device)
--- libgomp/config/nvptx/teams.c.jj     2018-07-25 15:34:52.363787032 +0200
+++ libgomp/config/nvptx/teams.c        2018-07-25 15:36:21.255978868 +0200
@@ -0,0 +1,57 @@
+/* Copyright (C) 2015-2018 Free Software Foundation, Inc.
+   Contributed by Alexander Monakov <amona...@ispras.ru>
+
+   This file is part of the GNU Offloading and Multi Processing Library
+   (libgomp).
+
+   Libgomp is free software; you can redistribute it and/or modify it
+   under the terms of the GNU General Public License as published by
+   the Free Software Foundation; either version 3, or (at your option)
+   any later version.
+
+   Libgomp is distributed in the hope that it will be useful, but WITHOUT ANY
+   WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS
+   FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+   more details.
+
+   Under Section 7 of GPL version 3, you are granted additional
+   permissions described in the GCC Runtime Library Exception, version
+   3.1, as published by the Free Software Foundation.
+
+   You should have received a copy of the GNU General Public License and
+   a copy of the GCC Runtime Library Exception along with this program;
+   see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
+   <http://www.gnu.org/licenses/>.  */
+
+/* This file defines OpenMP API entry points that accelerator targets are
+   expected to replace.  */
+
+#include "libgomp.h"
+
+void
+GOMP_teams_reg (void (*fn) (void *), void *data, unsigned int num_teams,
+               unsigned int thread_limit, unsigned int flags)
+{
+  (void) fn;
+  (void) data;
+  (void) flags;
+  (void) num_teams;
+  (void) thread_limit;
+}
+
+int
+omp_get_num_teams (void)
+{
+  return gomp_num_teams_var + 1;
+}
+
+int
+omp_get_team_num (void)
+{
+  int ctaid;
+  asm ("mov.u32 %0, %%ctaid.x;" : "=r" (ctaid));
+  return ctaid;
+}
+
+ialias (omp_get_num_teams)
+ialias (omp_get_team_num)
--- libgomp/testsuite/libgomp.c++/for-16.C.jj   2018-07-25 17:03:57.474067497 +0200
+++ libgomp/testsuite/libgomp.c++/for-16.C      2018-07-25 17:05:08.887219360 +0200
@@ -0,0 +1,218 @@
+// PR c++/86443
+// { dg-do run }
+// { dg-additional-options "-std=c++17" }
+
+typedef __PTRDIFF_TYPE__ ptrdiff_t;
+extern "C" void abort ();
+
+template <typename T>
+class I
+{
+public:
+  typedef ptrdiff_t difference_type;
+  I ();
+  ~I ();
+  I (T *);
+  I (const I &);
+  T &operator * ();
+  T *operator -> ();
+  T &operator [] (const difference_type &) const;
+  I &operator = (const I &);
+  I &operator ++ ();
+  I operator ++ (int);
+  I &operator -- ();
+  I operator -- (int);
+  I &operator += (const difference_type &);
+  I &operator -= (const difference_type &);
+  I operator + (const difference_type &) const;
+  I operator - (const difference_type &) const;
+  template <typename S> friend bool operator == (I<S> &, I<S> &);
+  template <typename S> friend bool operator == (const I<S> &, const I<S> &);
+  template <typename S> friend bool operator < (I<S> &, I<S> &);
+  template <typename S> friend bool operator < (const I<S> &, const I<S> &);
+  template <typename S> friend bool operator <= (I<S> &, I<S> &);
+  template <typename S> friend bool operator <= (const I<S> &, const I<S> &);
+  template <typename S> friend bool operator > (I<S> &, I<S> &);
+  template <typename S> friend bool operator > (const I<S> &, const I<S> &);
+  template <typename S> friend bool operator >= (I<S> &, I<S> &);
+  template <typename S> friend bool operator >= (const I<S> &, const I<S> &);
+  template <typename S> friend typename I<S>::difference_type operator - (I<S> &, I<S> &);
+  template <typename S> friend typename I<S>::difference_type operator - (const I<S> &, const I<S> &);
+  template <typename S> friend I<S> operator + (typename I<S>::difference_type, const I<S> &);
+private:
+  T *p;
+};
+template <typename T> I<T>::I () : p (0) {}
+template <typename T> I<T>::~I () {}
+template <typename T> I<T>::I (T *x) : p (x) {}
+template <typename T> I<T>::I (const I &x) : p (x.p) {}
+template <typename T> T &I<T>::operator * () { return *p; }
+template <typename T> T *I<T>::operator -> () { return p; }
+template <typename T> T &I<T>::operator [] (const difference_type &x) const { return p[x]; }
+template <typename T> I<T> &I<T>::operator = (const I &x) { p = x.p; return *this; }
+template <typename T> I<T> &I<T>::operator ++ () { ++p; return *this; }
+template <typename T> I<T> I<T>::operator ++ (int) { return I (p++); }
+template <typename T> I<T> &I<T>::operator -- () { --p; return *this; }
+template <typename T> I<T> I<T>::operator -- (int) { return I (p--); }
+template <typename T> I<T> &I<T>::operator += (const difference_type &x) { p += x; return *this; }
+template <typename T> I<T> &I<T>::operator -= (const difference_type &x) { p -= x; return *this; }
+template <typename T> I<T> I<T>::operator + (const difference_type &x) const { return I (p + x); }
+template <typename T> I<T> I<T>::operator - (const difference_type &x) const { return I (p - x); }
+template <typename T> bool operator == (I<T> &x, I<T> &y) { return x.p == y.p; }
+template <typename T> bool operator == (const I<T> &x, const I<T> &y) { return x.p == y.p; }
+template <typename T> bool operator != (I<T> &x, I<T> &y) { return !(x == y); }
+template <typename T> bool operator != (const I<T> &x, const I<T> &y) { return !(x == y); }
+template <typename T> bool operator < (I<T> &x, I<T> &y) { return x.p < y.p; }
+template <typename T> bool operator < (const I<T> &x, const I<T> &y) { return x.p < y.p; }
+template <typename T> bool operator <= (I<T> &x, I<T> &y) { return x.p <= y.p; }
+template <typename T> bool operator <= (const I<T> &x, const I<T> &y) { return x.p <= y.p; }
+template <typename T> bool operator > (I<T> &x, I<T> &y) { return x.p > y.p; }
+template <typename T> bool operator > (const I<T> &x, const I<T> &y) { return x.p > y.p; }
+template <typename T> bool operator >= (I<T> &x, I<T> &y) { return x.p >= y.p; }
+template <typename T> bool operator >= (const I<T> &x, const I<T> &y) { return x.p >= y.p; }
+template <typename T> typename I<T>::difference_type operator - (I<T> &x, I<T> &y) { return x.p - y.p; }
+template <typename T> typename I<T>::difference_type operator - (const I<T> &x, const I<T> &y) { return x.p - y.p; }
+template <typename T> I<T> operator + (typename I<T>::difference_type x, const I<T> &y) { return I<T> (x + y.p); }
+
+template <typename T>
+class J
+{
+public:
+  J(const I<T> &x, const I<T> &y) : b (x), e (y) {}
+  const I<T> &begin ();
+  const I<T> &end ();
+private:
+  I<T> b, e;
+};
+
+template <typename T> const I<T> &J<T>::begin () { return b; }
+template <typename T> const I<T> &J<T>::end () { return e; }
+
+int results[2000];
+
+template <typename T>
+void
+baz (I<T> &i)
+{
+  if (*i < 0 || *i >= 2000)
+    abort ();
+  results[*i]++;
+}
+
+void
+baz (int i)
+{
+  if (i < 0 || i >= 2000)
+    abort ();
+  results[i]++;
+}
+
+void
+f1 (J<int> j)
+{
+#pragma omp distribute parallel for default(none)
+  for (I<int> i = j.begin (); i < j.end (); i += 3)
+    baz (*i);
+}
+
+void
+f2 (J<int> j)
+{
+  I<int> i;
+#pragma omp distribute parallel for default(none)
+  for (i = j.begin (); i < j.end (); ++i)
+    baz (*i);
+}
+
+template <int N>
+void
+f3 (J<int> j)
+{
+#pragma omp distribute parallel for default(none)
+  for (I<int> i = j.begin (); i < j.end (); i += 6)
+    baz (*i);
+}
+
+template <int N>
+void
+f4 (J<int> j)
+{
+  I<int> i;
+#pragma omp distribute parallel for default(none)
+  for (i = j.begin (); i < j.end (); i += 9)
+    baz (*i);
+}
+
+template <typename T>
+void
+f5 (J<T> j)
+{
+#pragma omp distribute parallel for default(none)
+  for (I<T> i = j.begin (); i < j.end (); i += 4)
+    baz (*i);
+}
+
+template <typename T>
+void
+f6 (J<T> j)
+{
+  I<T> i;
+#pragma omp distribute parallel for default(none)
+  for (i = j.begin (); i < j.end (); i += 7)
+    baz (*i);
+}
+
+#define check(expr) \
+  for (int i = 0; i < 2000; i++)                       \
+    if (expr)                                          \
+      {                                                        \
+       if (results[i] != 1)                            \
+         abort ();                                     \
+       results[i] = 0;                                 \
+      }                                                        \
+    else if (results[i])                               \
+      abort ()
+
+int
+main ()
+{
+  int a[2000];
+  for (int i = 0; i < 2000; i++)
+    a[i] = i;
+  #pragma omp teams
+  {
+    J<int> j (&a[75], &a[1945]);
+    f1 (j);
+  }
+  check (i >= 75 && i < 1945 && (i - 75) % 3 == 0);
+  #pragma omp teams
+  {
+    J<int> j (&a[63], &a[1949]);
+    f2 (j);
+  }
+  check (i >= 63 && i < 1949);
+  #pragma omp teams
+  {
+    J<int> j (&a[58], &a[1979]);
+    f3 <2> (j);
+  }
+  check (i >= 58 && i < 1979 && (i - 58) % 6 == 0);
+  #pragma omp teams
+  {
+    J<int> j (&a[59], &a[1981]);
+    f4 <9> (j);
+  }
+  check (i >= 59 && i < 1981 && (i - 59) % 9 == 0);
+  #pragma omp teams
+  {
+    J<int> j (&a[52], &a[1972]);
+    f5 (j);
+  }
+  check (i >= 52 && i < 1972 && (i - 52) % 4 == 0);
+  #pragma omp teams
+  {
+    J<int> j (&a[31], &a[1827]);
+    f6 (j);
+  }
+  check (i >= 31 && i < 1827 && (i - 31) % 7 == 0);
+}
--- libgomp/testsuite/libgomp.c++/for-26.C.jj   2018-07-25 17:07:36.043532790 +0200
+++ libgomp/testsuite/libgomp.c++/for-26.C      2018-07-25 17:08:50.619691628 +0200
@@ -0,0 +1,422 @@
+// { dg-do run }
+// { dg-additional-options "-std=c++17" }
+
+typedef __PTRDIFF_TYPE__ ptrdiff_t;
+extern "C" void abort ();
+
+namespace std {
+  template<typename T> struct tuple_size;
+  template<int, typename> struct tuple_element;
+}
+
+template <typename T>
+class I
+{
+public:
+  typedef ptrdiff_t difference_type;
+  I ();
+  ~I ();
+  I (T *);
+  I (const I &);
+  T &operator * ();
+  T *operator -> ();
+  T &operator [] (const difference_type &) const;
+  I &operator = (const I &);
+  I &operator ++ ();
+  I operator ++ (int);
+  I &operator -- ();
+  I operator -- (int);
+  I &operator += (const difference_type &);
+  I &operator -= (const difference_type &);
+  I operator + (const difference_type &) const;
+  I operator - (const difference_type &) const;
+  template <typename S> friend bool operator == (I<S> &, I<S> &);
+  template <typename S> friend bool operator == (const I<S> &, const I<S> &);
+  template <typename S> friend bool operator < (I<S> &, I<S> &);
+  template <typename S> friend bool operator < (const I<S> &, const I<S> &);
+  template <typename S> friend bool operator <= (I<S> &, I<S> &);
+  template <typename S> friend bool operator <= (const I<S> &, const I<S> &);
+  template <typename S> friend bool operator > (I<S> &, I<S> &);
+  template <typename S> friend bool operator > (const I<S> &, const I<S> &);
+  template <typename S> friend bool operator >= (I<S> &, I<S> &);
+  template <typename S> friend bool operator >= (const I<S> &, const I<S> &);
+  template <typename S> friend typename I<S>::difference_type operator - (I<S> &, I<S> &);
+  template <typename S> friend typename I<S>::difference_type operator - (const I<S> &, const I<S> &);
+  template <typename S> friend I<S> operator + (typename I<S>::difference_type , const I<S> &);
+private:
+  T *p;
+};
+template <typename T> I<T>::I () : p (0) {}
+template <typename T> I<T>::~I () {}
+template <typename T> I<T>::I (T *x) : p (x) {}
+template <typename T> I<T>::I (const I &x) : p (x.p) {}
+template <typename T> T &I<T>::operator * () { return *p; }
+template <typename T> T *I<T>::operator -> () { return p; }
+template <typename T> T &I<T>::operator [] (const difference_type &x) const { return p[x]; }
+template <typename T> I<T> &I<T>::operator = (const I &x) { p = x.p; return *this; }
+template <typename T> I<T> &I<T>::operator ++ () { ++p; return *this; }
+template <typename T> I<T> I<T>::operator ++ (int) { return I (p++); }
+template <typename T> I<T> &I<T>::operator -- () { --p; return *this; }
+template <typename T> I<T> I<T>::operator -- (int) { return I (p--); }
+template <typename T> I<T> &I<T>::operator += (const difference_type &x) { p += x; return *this; }
+template <typename T> I<T> &I<T>::operator -= (const difference_type &x) { p -= x; return *this; }
+template <typename T> I<T> I<T>::operator + (const difference_type &x) const { return I (p + x); }
+template <typename T> I<T> I<T>::operator - (const difference_type &x) const { return I (p - x); }
+template <typename T> bool operator == (I<T> &x, I<T> &y) { return x.p == y.p; }
+template <typename T> bool operator == (const I<T> &x, const I<T> &y) { return x.p == y.p; }
+template <typename T> bool operator != (I<T> &x, I<T> &y) { return !(x == y); }
+template <typename T> bool operator != (const I<T> &x, const I<T> &y) { return !(x == y); }
+template <typename T> bool operator < (I<T> &x, I<T> &y) { return x.p < y.p; }
+template <typename T> bool operator < (const I<T> &x, const I<T> &y) { return x.p < y.p; }
+template <typename T> bool operator <= (I<T> &x, I<T> &y) { return x.p <= y.p; }
+template <typename T> bool operator <= (const I<T> &x, const I<T> &y) { return x.p <= y.p; }
+template <typename T> bool operator > (I<T> &x, I<T> &y) { return x.p > y.p; }
+template <typename T> bool operator > (const I<T> &x, const I<T> &y) { return x.p > y.p; }
+template <typename T> bool operator >= (I<T> &x, I<T> &y) { return x.p >= y.p; }
+template <typename T> bool operator >= (const I<T> &x, const I<T> &y) { return x.p >= y.p; }
+template <typename T> typename I<T>::difference_type operator - (I<T> &x, I<T> &y) { return x.p - y.p; }
+template <typename T> typename I<T>::difference_type operator - (const I<T> &x, const I<T> &y) { return x.p - y.p; }
+template <typename T> I<T> operator + (typename I<T>::difference_type x, const I<T> &y) { return I<T> (x + y.p); }
+
+template <typename T>
+class J
+{
+public:
+  J(const I<T> &x, const I<T> &y) : b (x), e (y) {}
+  const I<T> &begin ();
+  const I<T> &end ();
+private:
+  I<T> b, e;
+};
+
+template <typename T> const I<T> &J<T>::begin () { return b; }
+template <typename T> const I<T> &J<T>::end () { return e; }
+
+struct K
+{
+  template <int N> int &get () { if (N == 0) return c; else if (N == 1) return b; return a; }
+  int a, b, c;
+};
+
+template <> struct std::tuple_size<K> { static constexpr int value = 3; };
+template <int N> struct std::tuple_element<N, K> { using type = int; };
+
+struct L
+{
+  int a, b, c;
+};
+
+int a[2000];
+long b[40];
+short c[50];
+int d[1024];
+K e[1089];
+L f[1093];
+
+int results[2000];
+
+template <typename T>
+void
+baz (I<T> &i)
+{
+  if (*i < 0 || *i >= 2000)
+    abort ();
+  results[*i]++;
+}
+
+void
+baz (int i)
+{
+  if (i < 0 || i >= 2000)
+    abort ();
+  results[i]++;
+}
+
+void
+f1 ()
+{
+#pragma omp distribute parallel for default(none) shared(a)
+  for (auto i : a)
+    baz (i);
+}
+
+void
+f2 ()
+{
+#pragma omp distribute parallel for default(none) shared(a)
+  for (auto &i : a)
+    if (&i != &a[i])
+      abort ();
+    else
+      baz (i);
+}
+
+void
+f3 ()
+{
+#pragma omp distribute parallel for collapse(3) default(none) shared(b, c)
+  for (auto &i : b)
+    for (int j = 9; j < 10; j++)
+      for (auto k : c)
+       if (&i != &b[i] || i < 0 || i >= 40 || j != 9 || k < 0 || k >= 50)
+         abort ();
+       else
+         baz (i * 50 + k);
+}
+
+void
+f4 (J<int> j)
+{
+#pragma omp distribute parallel for default(none) shared(j, a)
+  for (auto &i : j)
+    if (&i != &a[i])
+      abort ();
+    else
+      baz (i);
+}
+
+void
+f5 ()
+{
+#pragma omp distribute parallel for simd default(none) shared(d, results)
+  for (auto i : d)
+    results[i % 1024] += 2 * ((unsigned) i >> 10) + 1;
+}
+
+void
+f6 (J<K> j)
+{
+#pragma omp distribute parallel for default(none) shared(j, e)
+  for (auto & [k, l, m] : j)
+    if (&k != &e[m].c || &l != &e[m].b || &m != &e[m].a || k != m * 3 || l != m * 2)
+      abort ();
+    else
+      baz (m);
+}
+
+void
+f7 (J<L> j)
+{
+#pragma omp distribute parallel for default(none) shared(j, f)
+  for (auto & [k, l, m] : j)
+    if (&k != &f[k].a || &l != &f[k].b || &m != &f[k].c || l != k * 4 || m != k * 5)
+      abort ();
+    else
+      baz (k);
+}
+
+void
+f8 (J<K> j)
+{
+#pragma omp distribute parallel for default(none) shared(j)
+  for (auto [k, l, m] : j)
+    if (k != m * 3 || l != m * 2)
+      abort ();
+    else
+      baz (m);
+}
+
+void
+f9 (J<L> j)
+{
+#pragma omp distribute parallel for default(none) shared(j)
+  for (auto [k, l, m] : j)
+    if (l != k * 4 || m != k * 5)
+      abort ();
+    else
+      baz (k);
+}
+
+template <int N>
+void
+f10 ()
+{
+#pragma omp distribute parallel for default(none) shared(a)
+  for (auto i : a)
+    baz (i);
+}
+
+template <int N>
+void
+f11 ()
+{
+#pragma omp distribute parallel for default(none) shared(a)
+  for (auto &i : a)
+    if (&i != &a[i])
+      abort ();
+    else
+      baz (i);
+}
+
+template <int N>
+void
+f12 ()
+{
+#pragma omp distribute parallel for collapse(3) default(none) shared(a, b, c)
+  for (auto &i : b)
+    for (I<int> j = I<int> (&a[9]); j < I<int> (&a[10]); j++)
+      for (auto k : c)
+       if (&i != &b[i] || i < 0 || i >= 40 || *j != 9 || k < 0 || k >= 50)
+         abort ();
+       else
+         baz (i * 50 + k);
+}
+
+template <typename T>
+void
+f13 (J<T> j)
+{
+#pragma omp distribute parallel for default(none) shared(j, a)
+  for (auto &i : j)
+    if (&i != &a[i])
+      abort ();
+    else
+      baz (i);
+}
+
+template <int N>
+void
+f14 ()
+{
+#pragma omp distribute parallel for simd default(none) shared(d, results)
+  for (auto i : d)
+    results[i % N] += 2 * ((unsigned) i >> 10) + 1;
+}
+
+template <typename T>
+void
+f15 (J<T> j)
+{
+#pragma omp distribute parallel for default(none) shared(j, e)
+  for (auto & [k, l, m] : j)
+    if (&k != &e[m].c || &l != &e[m].b || &m != &e[m].a || k != m * 3 || l != m * 2)
+      abort ();
+    else
+      baz (m);
+}
+
+template <typename T>
+void
+f16 (J<T> j)
+{
+#pragma omp distribute parallel for default(none) shared(j, f)
+  for (auto & [k, l, m] : j)
+    if (&k != &f[k].a || &l != &f[k].b || &m != &f[k].c || l != k * 4 || m != k * 5)
+      abort ();
+    else
+      baz (k);
+}
+
+template <int N>
+void
+f17 (J<K> j)
+{
+#pragma omp distribute parallel for default(none) shared(j)
+  for (auto [k, l, m] : j)
+    if (k != m * 3 || l != m * 2)
+      abort ();
+    else
+      baz (m);
+}
+
+template <int N>
+void
+f18 (J<L> j)
+{
+#pragma omp distribute parallel for default(none) shared(j)
+  for (auto [k, l, m] : j)
+    if (l != k * 4 || m != k * 5)
+      abort ();
+    else
+      baz (k);
+}
+
+#define check(expr) \
+  for (int i = 0; i < 2000; i++)                       \
+    if (expr)                                          \
+      {                                                        \
+       if (results[i] != 1)                            \
+         abort ();                                     \
+       results[i] = 0;                                 \
+      }                                                        \
+    else if (results[i])                               \
+      abort ()
+
+int
+main ()
+{
+  for (int i = 0; i < 2000; i++)
+    a[i] = i;
+  for (int i = 0; i < 40; i++)
+    b[i] = i;
+  for (int i = 0; i < 50; i++)
+    c[i] = i;
+  for (int i = 0; i < 1024; i++)
+    d[i] = i;
+  for (int i = 0; i < 1089; i++)
+    {
+      e[i].a = i;
+      e[i].b = 2 * i;
+      e[i].c = 3 * i;
+    }
+  for (int i = 0; i < 1093; i++)
+    {
+      f[i].a = i;
+      f[i].b = 4 * i;
+      f[i].c = 5 * i;
+    }
+  #pragma omp teams
+  f1 ();
+  check (1);
+  #pragma omp teams
+  f2 ();
+  check (1);
+  #pragma omp teams
+  f3 ();
+  check (1);
+  #pragma omp teams
+  f4 (J<int> (&a[14], &a[1803]));
+  check (i >= 14 && i < 1803);
+  #pragma omp teams
+  f5 ();
+  check (i >= 0 && i < 1024);
+  #pragma omp teams
+  f6 (J<K> (&e[19], &e[1029]));
+  check (i >= 19 && i < 1029);
+  #pragma omp teams
+  f7 (J<L> (&f[15], &f[1091]));
+  check (i >= 15 && i < 1091);
+  #pragma omp teams
+  f8 (J<K> (&e[27], &e[1037]));
+  check (i >= 27 && i < 1037);
+  #pragma omp teams
+  f9 (J<L> (&f[1], &f[1012]));
+  check (i >= 1 && i < 1012);
+  #pragma omp teams
+  f10 <0> ();
+  check (1);
+  #pragma omp teams
+  f11 <1> ();
+  check (1);
+  #pragma omp teams
+  f12 <2> ();
+  check (1);
+  #pragma omp teams
+  f13 (J<int> (&a[24], &a[1703]));
+  check (i >= 24 && i < 1703);
+  #pragma omp teams
+  f14 <1024> ();
+  check (i >= 0 && i < 1024);
+  #pragma omp teams
+  f15 (J<K> (&e[39], &e[929]));
+  check (i >= 39 && i < 929);
+  #pragma omp teams
+  f16 (J<L> (&f[17], &f[1071]));
+  check (i >= 17 && i < 1071);
+  #pragma omp teams
+  f17 <3> (J<K> (&e[7], &e[1017]));
+  check (i >= 7 && i < 1017);
+  #pragma omp teams
+  f18 <5> (J<L> (&f[121], &f[1010]));
+  check (i >= 121 && i < 1010);
+}
--- libgomp/testsuite/libgomp.c-c++-common/for-14.c.jj  2018-07-25 16:34:48.574289991 +0200
+++ libgomp/testsuite/libgomp.c-c++-common/for-14.c     2018-07-25 16:35:07.389329335 +0200
@@ -0,0 +1,110 @@
+/* { dg-additional-options "-std=gnu99" { target c } } */
+
+extern
+#ifdef __cplusplus
+"C"
+#endif
+void abort ();
+
+#define M(x, y, z) O(x, y, z)
+#define O(x, y, z) x ## _ ## y ## _ ## z
+
+#define F distribute
+#define G d
+#define S
+#define N(x) M(x, G, normal)
+#include "for-2.h"
+#undef S
+#undef N
+#undef F
+#undef G
+
+#define F distribute
+#define G d_ds128
+#define S dist_schedule(static, 128)
+#define N(x) M(x, G, normal)
+#include "for-2.h"
+#undef S
+#undef N
+#undef F
+#undef G
+
+#define F distribute simd
+#define G ds
+#define S
+#define N(x) M(x, G, normal)
+#include "for-2.h"
+#undef S
+#undef N
+#undef F
+#undef G
+
+#define F distribute simd
+#define G ds_ds128
+#define S dist_schedule(static, 128)
+#define N(x) M(x, G, normal)
+#include "for-2.h"
+#undef S
+#undef N
+#undef F
+#undef G
+
+#define F distribute parallel for
+#define G dpf
+#include "for-1.h"
+#undef F
+#undef G
+
+#define F distribute parallel for dist_schedule(static, 128)
+#define G dpf_ds128
+#include "for-1.h"
+#undef F
+#undef G
+
+#define F distribute parallel for simd
+#define G dpfs
+#include "for-1.h"
+#undef F
+#undef G
+
+#define F distribute parallel for simd dist_schedule(static, 128)
+#define G dpfs_ds128
+#include "for-1.h"
+#undef F
+#undef G
+
+int
+main ()
+{
+  int err = 0;
+  #pragma omp teams reduction(|:err)
+    {
+      err |= test_d_normal ();
+      err |= test_d_ds128_normal ();
+      err |= test_ds_normal ();
+      err |= test_ds_ds128_normal ();
+      err |= test_dpf_static ();
+      err |= test_dpf_static32 ();
+      err |= test_dpf_auto ();
+      err |= test_dpf_guided32 ();
+      err |= test_dpf_runtime ();
+      err |= test_dpf_ds128_static ();
+      err |= test_dpf_ds128_static32 ();
+      err |= test_dpf_ds128_auto ();
+      err |= test_dpf_ds128_guided32 ();
+      err |= test_dpf_ds128_runtime ();
+      err |= test_dpfs_static ();
+      err |= test_dpfs_static32 ();
+      err |= test_dpfs_auto ();
+      err |= test_dpfs_guided32 ();
+      err |= test_dpfs_runtime ();
+      err |= test_dpfs_ds128_static ();
+      err |= test_dpfs_ds128_static32 ();
+      err |= test_dpfs_ds128_auto ();
+      err |= test_dpfs_ds128_guided32 ();
+      err |= test_dpfs_ds128_runtime ();
+    }
+  if (err)
+    abort ();
+  return 0;
+}
--- libgomp/testsuite/libgomp.c-c++-common/for-15.c.jj  2018-07-25 16:35:28.833374175 +0200
+++ libgomp/testsuite/libgomp.c-c++-common/for-15.c     2018-07-25 16:38:19.969732022 +0200
@@ -0,0 +1,115 @@
+/* { dg-additional-options "-std=gnu99" { target c } } */
+
+extern
+#ifdef __cplusplus
+"C"
+#endif
+void abort ();
+
+#define M(x, y, z) O(x, y, z)
+#define O(x, y, z) x ## _ ## y ## _ ## z
+
+#define F for
+#define G f
+#define S
+#define N(x) M(x, G, normal)
+#include "for-2.h"
+#undef S
+#undef N
+#undef F
+#undef G
+
+#define F teams distribute
+#define G td
+#define S
+#define N(x) M(x, G, normal)
+#include "for-2.h"
+#undef S
+#undef N
+#undef F
+#undef G
+
+#define F teams distribute
+#define G td_ds128
+#define S dist_schedule(static, 128)
+#define N(x) M(x, G, normal)
+#include "for-2.h"
+#undef S
+#undef N
+#undef F
+#undef G
+
+#define F teams distribute simd
+#define G tds
+#define S
+#define N(x) M(x, G, normal)
+#include "for-2.h"
+#undef S
+#undef N
+#undef F
+#undef G
+
+#define F teams distribute simd
+#define G tds_ds128
+#define S dist_schedule(static, 128)
+#define N(x) M(x, G, normal)
+#include "for-2.h"
+#undef S
+#undef N
+#undef F
+#undef G
+
+#define F teams distribute parallel for
+#define G tdpf
+#include "for-1.h"
+#undef F
+#undef G
+
+#define F teams distribute parallel for dist_schedule(static, 128)
+#define G tdpf_ds128
+#include "for-1.h"
+#undef F
+#undef G
+
+#define F teams distribute parallel for simd
+#define G tdpfs
+#include "for-1.h"
+#undef F
+#undef G
+
+#define F teams distribute parallel for simd dist_schedule(static, 128)
+#define G tdpfs_ds128
+#include "for-1.h"
+#undef F
+#undef G
+
+int
+main ()
+{
+  if (test_td_normal ()
+      || test_td_ds128_normal ()
+      || test_tds_normal ()
+      || test_tds_ds128_normal ()
+      || test_tdpf_static ()
+      || test_tdpf_static32 ()
+      || test_tdpf_auto ()
+      || test_tdpf_guided32 ()
+      || test_tdpf_runtime ()
+      || test_tdpf_ds128_static ()
+      || test_tdpf_ds128_static32 ()
+      || test_tdpf_ds128_auto ()
+      || test_tdpf_ds128_guided32 ()
+      || test_tdpf_ds128_runtime ()
+      || test_tdpfs_static ()
+      || test_tdpfs_static32 ()
+      || test_tdpfs_auto ()
+      || test_tdpfs_guided32 ()
+      || test_tdpfs_runtime ()
+      || test_tdpfs_ds128_static ()
+      || test_tdpfs_ds128_static32 ()
+      || test_tdpfs_ds128_auto ()
+      || test_tdpfs_ds128_guided32 ()
+      || test_tdpfs_ds128_runtime ())
+    abort ();
+  return 0;
+}
--- libgomp/testsuite/libgomp.c-c++-common/pr66199-10.c.jj      2018-07-25 16:04:13.835398458 +0200
+++ libgomp/testsuite/libgomp.c-c++-common/pr66199-10.c 2018-07-25 16:04:49.274477458 +0200
@@ -0,0 +1,60 @@
+/* PR middle-end/66199 */
+/* { dg-do run } */
+
+int u[1024], v[1024], w[1024];
+
+__attribute__((noinline, noclone)) long
+f1 (long a, long b)
+{
+  long d;
+  #pragma omp teams distribute parallel for simd default(none) firstprivate (a, b) shared(u, v, w)
+  for (d = a; d < b; d++)
+    u[d] = v[d] + w[d];
+  return d;
+}
+
+__attribute__((noinline, noclone)) long
+f2 (long a, long b, long c)
+{
+  long d, e;
+  #pragma omp teams distribute parallel for simd default(none) firstprivate (a, b, c) shared(u, v, w) linear(d) lastprivate(e)
+  for (d = a; d < b; d++)
+    {
+      u[d] = v[d] + w[d];
+      e = c + d * 5;
+    }
+  return d + e;
+}
+
+__attribute__((noinline, noclone)) long
+f3 (long a1, long b1, long a2, long b2)
+{
+  long d1, d2;
+  #pragma omp teams distribute parallel for simd default(none) firstprivate (a1, b1, a2, b2) shared(u, v, w) lastprivate(d1, d2) collapse(2)
+  for (d1 = a1; d1 < b1; d1++)
+    for (d2 = a2; d2 < b2; d2++)
+      u[d1 * 32 + d2] = v[d1 * 32 + d2] + w[d1 * 32 + d2];
+  return d1 + d2;
+}
+
+__attribute__((noinline, noclone)) long
+f4 (long a1, long b1, long a2, long b2)
+{
+  long d1, d2;
+  #pragma omp teams distribute parallel for simd default(none) firstprivate (a1, b1, a2, b2) shared(u, v, w) collapse(2)
+  for (d1 = a1; d1 < b1; d1++)
+    for (d2 = a2; d2 < b2; d2++)
+      u[d1 * 32 + d2] = v[d1 * 32 + d2] + w[d1 * 32 + d2];
+  return d1 + d2;
+}
+
+int
+main ()
+{
+  if (f1 (0, 1024) != 1024
+      || f2 (0, 1024, 17) != 1024 + (17 + 5 * 1023)
+      || f3 (0, 32, 0, 32) != 64
+      || f4 (0, 32, 0, 32) != 64)
+    __builtin_abort ();
+  return 0;
+}
--- libgomp/testsuite/libgomp.c-c++-common/pr66199-11.c.jj      2018-07-25 16:04:59.035499215 +0200
+++ libgomp/testsuite/libgomp.c-c++-common/pr66199-11.c 2018-07-25 16:10:46.049242367 +0200
@@ -0,0 +1,38 @@
+/* PR middle-end/66199 */
+/* { dg-do run } */
+/* { dg-options "-O2" { target c } } */
+
+int u[1024], v[1024], w[1024];
+
+__attribute__((noinline, noclone)) long
+f2 (long a, long b, long c)
+{
+  long d, e;
+  #pragma omp teams distribute parallel for default(none) firstprivate (a, b, c) shared(u, v, w) lastprivate(d, e)
+  for (d = a; d < b; d++)
+    {
+      u[d] = v[d] + w[d];
+      e = c + d * 5;
+    }
+  return d + e;
+}
+
+__attribute__((noinline, noclone)) long
+f3 (long a1, long b1, long a2, long b2)
+{
+  long d1, d2;
+  #pragma omp teams distribute parallel for default(none) firstprivate (a1, b1, a2, b2) shared(u, v, w) lastprivate(d1, d2) collapse(2)
+  for (d1 = a1; d1 < b1; d1++)
+    for (d2 = a2; d2 < b2; d2++)
+      u[d1 * 32 + d2] = v[d1 * 32 + d2] + w[d1 * 32 + d2];
+  return d1 + d2;
+}
+
+int
+main ()
+{
+  if (f2 (0, 1024, 17) != 1024 + (17 + 5 * 1023)
+      || f3 (0, 32, 0, 32) != 64)
+    __builtin_abort ();
+  return 0;
+}
--- libgomp/testsuite/libgomp.c-c++-common/pr66199-12.c.jj      2018-07-25 16:11:10.200291206 +0200
+++ libgomp/testsuite/libgomp.c-c++-common/pr66199-12.c 2018-07-25 16:11:23.484318062 +0200
@@ -0,0 +1,60 @@
+/* PR middle-end/66199 */
+/* { dg-do run } */
+
+int u[1024], v[1024], w[1024];
+
+__attribute__((noinline, noclone)) long
+f1 (long a, long b)
+{
+  long d;
+  #pragma omp teams distribute simd default(none) firstprivate (a, b) shared(u, v, w)
+  for (d = a; d < b; d++)
+    u[d] = v[d] + w[d];
+  return d;
+}
+
+__attribute__((noinline, noclone)) long
+f2 (long a, long b, long c)
+{
+  long d, e;
+  #pragma omp teams distribute simd default(none) firstprivate (a, b, c) shared(u, v, w) linear(d) lastprivate(e)
+  for (d = a; d < b; d++)
+    {
+      u[d] = v[d] + w[d];
+      e = c + d * 5;
+    }
+  return d + e;
+}
+
+__attribute__((noinline, noclone)) long
+f3 (long a1, long b1, long a2, long b2)
+{
+  long d1, d2;
+  #pragma omp teams distribute simd default(none) firstprivate (a1, b1, a2, b2) shared(u, v, w) lastprivate(d1, d2) collapse(2)
+  for (d1 = a1; d1 < b1; d1++)
+    for (d2 = a2; d2 < b2; d2++)
+      u[d1 * 32 + d2] = v[d1 * 32 + d2] + w[d1 * 32 + d2];
+  return d1 + d2;
+}
+
+__attribute__((noinline, noclone)) long
+f4 (long a1, long b1, long a2, long b2)
+{
+  long d1, d2;
+  #pragma omp teams distribute simd default(none) firstprivate (a1, b1, a2, b2) shared(u, v, w) collapse(2)
+  for (d1 = a1; d1 < b1; d1++)
+    for (d2 = a2; d2 < b2; d2++)
+      u[d1 * 32 + d2] = v[d1 * 32 + d2] + w[d1 * 32 + d2];
+  return d1 + d2;
+}
+
+int
+main ()
+{
+  if (f1 (0, 1024) != 1024
+      || f2 (0, 1024, 17) != 1024 + (17 + 5 * 1023)
+      || f3 (0, 32, 0, 32) != 64
+      || f4 (0, 32, 0, 32) != 64)
+    __builtin_abort ();
+  return 0;
+}
--- libgomp/testsuite/libgomp.c-c++-common/pr66199-13.c.jj      2018-07-25 16:11:45.326362234 +0200
+++ libgomp/testsuite/libgomp.c-c++-common/pr66199-13.c 2018-07-25 16:11:59.207390304 +0200
@@ -0,0 +1,64 @@
+/* PR middle-end/66199 */
+/* { dg-do run } */
+
+int u[1024], v[1024], w[1024];
+
+__attribute__((noinline, noclone)) long
+f1 (long a, long b)
+{
+  long d;
+  #pragma omp teams default(none) shared(a, b, d, u, v, w)
+  #pragma omp distribute simd firstprivate (a, b)
+  for (d = a; d < b; d++)
+    u[d] = v[d] + w[d];
+  return d;
+}
+
+__attribute__((noinline, noclone)) long
+f2 (long a, long b, long c)
+{
+  long d, e;
+  #pragma omp teams default(none) firstprivate (a, b, c) shared(d, e, u, v, w)
+  #pragma omp distribute simd linear(d) lastprivate(e)
+  for (d = a; d < b; d++)
+    {
+      u[d] = v[d] + w[d];
+      e = c + d * 5;
+    }
+  return d + e;
+}
+
+__attribute__((noinline, noclone)) long
+f3 (long a1, long b1, long a2, long b2)
+{
+  long d1, d2;
+  #pragma omp teams default(none) shared(a1, b1, a2, b2, d1, d2, u, v, w)
+  #pragma omp distribute simd firstprivate (a1, b1, a2, b2) lastprivate(d1, d2) collapse(2)
+  for (d1 = a1; d1 < b1; d1++)
+    for (d2 = a2; d2 < b2; d2++)
+      u[d1 * 32 + d2] = v[d1 * 32 + d2] + w[d1 * 32 + d2];
+  return d1 + d2;
+}
+
+__attribute__((noinline, noclone)) long
+f4 (long a1, long b1, long a2, long b2)
+{
+  long d1, d2;
+  #pragma omp teams default(none) firstprivate (a1, b1, a2, b2) shared(d1, d2, u, v, w)
+  #pragma omp distribute simd collapse(2)
+  for (d1 = a1; d1 < b1; d1++)
+    for (d2 = a2; d2 < b2; d2++)
+      u[d1 * 32 + d2] = v[d1 * 32 + d2] + w[d1 * 32 + d2];
+  return d1 + d2;
+}
+
+int
+main ()
+{
+  if (f1 (0, 1024) != 1024
+      || f2 (0, 1024, 17) != 1024 + (17 + 5 * 1023)
+      || f3 (0, 32, 0, 32) != 64
+      || f4 (0, 32, 0, 32) != 64)
+    __builtin_abort ();
+  return 0;
+}
--- libgomp/testsuite/libgomp.c-c++-common/pr66199-14.c.jj      2018-07-25 16:12:25.694444097 +0200
+++ libgomp/testsuite/libgomp.c-c++-common/pr66199-14.c 2018-07-25 16:12:33.965461624 +0200
@@ -0,0 +1,39 @@
+/* PR middle-end/66199 */
+/* { dg-do run } */
+
+int u[1024], v[1024], w[1024];
+
+__attribute__((noinline, noclone)) long
+f2 (long a, long b, long c)
+{
+  long d, e;
+  #pragma omp teams default(none) firstprivate (a, b, c) shared(d, e, u, v, w)
+  #pragma omp distribute lastprivate(d, e)
+  for (d = a; d < b; d++)
+    {
+      u[d] = v[d] + w[d];
+      e = c + d * 5;
+    }
+  return d + e;
+}
+
+__attribute__((noinline, noclone)) long
+f3 (long a1, long b1, long a2, long b2)
+{
+  long d1, d2;
+  #pragma omp teams default(none) shared(a1, b1, a2, b2, d1, d2, u, v, w)
+  #pragma omp distribute firstprivate (a1, b1, a2, b2) lastprivate(d1, d2) collapse(2)
+  for (d1 = a1; d1 < b1; d1++)
+    for (d2 = a2; d2 < b2; d2++)
+      u[d1 * 32 + d2] = v[d1 * 32 + d2] + w[d1 * 32 + d2];
+  return d1 + d2;
+}
+
+int
+main ()
+{
+  if (f2 (0, 1024, 17) != 1024 + (17 + 5 * 1023)
+      || f3 (0, 32, 0, 32) != 64)
+    __builtin_abort ();
+  return 0;
+}
--- libgomp/testsuite/libgomp.c/teams-1.c.jj    2018-07-25 16:48:10.261975371 +0200
+++ libgomp/testsuite/libgomp.c/teams-1.c       2018-07-25 16:50:16.463244588 +0200
@@ -0,0 +1,27 @@
+#include <omp.h>
+#include <stdlib.h>
+
+int
+main ()
+{
+  #pragma omp teams thread_limit (2)
+  {
+    if (omp_in_parallel ()
+       || omp_get_level () != 0
+       || omp_get_ancestor_thread_num (0) != 0
+       || omp_get_ancestor_thread_num (1) != -1)
+      abort ();
+    omp_set_dynamic (0);
+    omp_set_nested (1);
+    #pragma omp parallel num_threads (2)
+    {
+      if (!omp_in_parallel ()
+         || omp_get_level () != 1
+         || omp_get_ancestor_thread_num (0) != 0
+         || omp_get_ancestor_thread_num (1) != omp_get_thread_num ()
+         || omp_get_ancestor_thread_num (2) != -1)
+       abort ();
+    }
+  }
+  return 0;
+}
--- libgomp/testsuite/libgomp.c/teams-2.c.jj    2018-07-25 16:51:33.034407935 +0200
+++ libgomp/testsuite/libgomp.c/teams-2.c       2018-07-25 17:00:33.718636815 +0200
@@ -0,0 +1,123 @@
+#include <omp.h>
+#include <stdlib.h>
+
+__attribute__((noinline))
+void
+foo (int x, int y, int z, int *a, int *b)
+{
+  if (x == 0)
+    {
+      int i, j;
+      for (i = 0; i < 64; i++)
+       #pragma omp parallel for shared (a, b)
+       for (j = 0; j < 32; j++)
+         foo (3, i, j, a, b);
+    }
+  else if (x == 1)
+    {
+      int i, j;
+      #pragma omp distribute dist_schedule (static, 1)
+      for (i = 0; i < 64; i++)
+       #pragma omp parallel for shared (a, b)
+       for (j = 0; j < 32; j++)
+         foo (3, i, j, a, b);
+    }
+  else if (x == 2)
+    {
+      int j;
+      #pragma omp parallel for shared (a, b)
+      for (j = 0; j < 32; j++)
+       foo (3, y, j, a, b);
+    }
+  else
+    {
+      #pragma omp atomic
+      b[y] += z;
+      #pragma omp atomic
+      *a += 1;
+    }
+}
+
+__attribute__((noinline))
+int
+bar (int x, int y, int z)
+{
+  int a, b[64], i;
+  a = 8;
+  for (i = 0; i < 64; i++)
+    b[i] = i;
+  foo (x, y, z, &a, b);
+  if (x == 0)
+    {
+      if (a != 8 + 64 * 32)
+       return 1;
+      for (i = 0; i < 64; i++)
+       if (b[i] != i + 31 * 32 / 2)
+         return 1;
+    }
+  else if (x == 1)
+    {
+      int c = omp_get_num_teams ();
+      int d = omp_get_team_num ();
+      int e = d;
+      int f = 0;
+      for (i = 0; i < 64; i++)
+       if (i == e)
+         {
+           if (b[i] != i + 31 * 32 / 2)
+             return 1;
+           f++;
+           e = e + c;
+         }
+       else if (b[i] != i)
+         return 1;
+      if (a < 8 || a > 8 + f * 32)
+       return 1;
+    }
+  else if (x == 2)
+    {
+      if (a != 8 + 32)
+       return 1;
+      for (i = 0; i < 64; i++)
+       if (b[i] != i + (i == y ? 31 * 32 / 2 : 0))
+         return 1;
+    }
+  else if (x == 3)
+    {
+      if (a != 8 + 1)
+       return 1;
+      for (i = 0; i < 64; i++)
+       if (b[i] != i + (i == y ? z : 0))
+         return 1;
+    }
+  return 0;
+}
+
+int
+main ()
+{
+  int i, j, err = 0;
+  #pragma omp teams reduction(+:err)
+  err += bar (0, 0, 0);
+  if (err)
+    abort ();
+  #pragma omp teams reduction(+:err)
+  err += bar (1, 0, 0);
+  if (err)
+    abort ();
+  #pragma omp teams reduction(+:err)
+  #pragma omp distribute
+  for (i = 0; i < 64; i++)
+    err += bar (2, i, 0);
+  if (err)
+    abort ();
+  #pragma omp teams reduction(+:err)
+  #pragma omp distribute
+  for (i = 0; i < 64; i++)
+  #pragma omp parallel for reduction(+:err)
+    for (j = 0; j < 32; j++)
+      err += bar (3, i, j);
+  if (err)
+    abort ();
+  return 0;
+}
--- libgomp/testsuite/libgomp.c/thread-limit-5.c.jj     2018-07-25 16:46:32.265767205 +0200
+++ libgomp/testsuite/libgomp.c/thread-limit-5.c        2018-07-25 16:46:48.664801688 +0200
@@ -0,0 +1,11 @@
+#include <stdlib.h>
+#include <omp.h>
+
+int
+main ()
+{
+  #pragma omp teams thread_limit (1)
+  if (omp_get_thread_limit () != 1)
+    abort ();
+  return 0;
+}
--- libgomp/testsuite/libgomp.c/thread-limit-4.c.jj     2018-07-25 16:44:40.048531236 +0200
+++ libgomp/testsuite/libgomp.c/thread-limit-4.c        2018-07-25 16:46:22.398746457 +0200
@@ -0,0 +1,57 @@
+/* { dg-do run } */
+/* { dg-set-target-env-var OMP_THREAD_LIMIT "9" } */
+
+#include <stdlib.h>
+#include <unistd.h>
+#include <omp.h>
+
+int
+main ()
+{
+  if (omp_get_thread_limit () != 9)
+    return 0;
+  omp_set_dynamic (0);
+  #pragma omp parallel num_threads (8)
+  if (omp_get_num_threads () != 8)
+    abort ();
+  #pragma omp parallel num_threads (16)
+  if (omp_get_num_threads () > 9)
+    abort ();
+  #pragma omp teams thread_limit (6)
+  {
+    if (omp_get_thread_limit () > 6)
+      abort ();
+    if (omp_get_thread_limit () == 6)
+      {
+       omp_set_dynamic (0);
+       omp_set_nested (1);
+       #pragma omp parallel num_threads (3)
+       if (omp_get_num_threads () != 3)
+         abort ();
+       #pragma omp parallel num_threads (3)
+       if (omp_get_num_threads () != 3)
+         abort ();
+       #pragma omp parallel num_threads (8)
+       if (omp_get_num_threads () > 6)
+         abort ();
+       #pragma omp parallel num_threads (6)
+       if (omp_get_num_threads () != 6)
+         abort ();
+       int cnt = 0;
+       #pragma omp parallel num_threads (5)
+       #pragma omp parallel num_threads (5)
+       #pragma omp parallel num_threads (2)
+       {
+         int v;
+         #pragma omp atomic capture
+         v = ++cnt;
+         if (v > 6)
+           abort ();
+         usleep (10000);
+         #pragma omp atomic
+         --cnt;
+       }
+      }
+  }
+  return 0;
+}
--- libgomp/Makefile.in.jj      2018-05-21 17:21:39.227956241 +0200
+++ libgomp/Makefile.in 2018-07-25 15:42:43.580776141 +0200
@@ -181,7 +181,7 @@ am_libgomp_la_OBJECTS = alloc.lo atomic.
        target.lo splay-tree.lo libgomp-plugin.lo oacc-parallel.lo \
        oacc-host.lo oacc-init.lo oacc-mem.lo oacc-async.lo \
        oacc-plugin.lo oacc-cuda.lo priority_queue.lo affinity-fmt.lo \
-       $(am__objects_1)
+       teams.lo $(am__objects_1)
 libgomp_la_OBJECTS = $(am_libgomp_la_OBJECTS)
 DEFAULT_INCLUDES = -I.@am__isrc@
 depcomp = $(SHELL) $(top_srcdir)/../depcomp
@@ -437,7 +437,7 @@ libgomp_la_SOURCES = alloc.c atomic.c ba
        affinity.c target.c splay-tree.c libgomp-plugin.c \
        oacc-parallel.c oacc-host.c oacc-init.c oacc-mem.c \
        oacc-async.c oacc-plugin.c oacc-cuda.c priority_queue.c \
-       affinity-fmt.c $(am__append_3)
+       affinity-fmt.c teams.c $(am__append_3)
 
 # Nvidia PTX OpenACC plugin.
 @PLUGIN_NVPTX_TRUE@libgomp_plugin_nvptx_version_info = -version-info $(libtool_VERSION)
@@ -640,6 +640,7 @@ distclean-compile:
 @AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/target.Plo@am__quote@
 @AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/task.Plo@am__quote@
 @AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/team.Plo@am__quote@
+@AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/teams.Plo@am__quote@
 @AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/time.Plo@am__quote@
 @AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/work.Plo@am__quote@
 

        Jakub
