Hello

This is a WIP implementation of the OpenMP 5.0 task detach clause. The task construct can now take a detach clause, passing in a variable of type omp_event_handle_t. When the construct is encountered, space for an event is allocated and the event variable is set to point to the new event. When the task is run, it is not complete until a new function omp_fulfill_event has been called on the event variable, either in the task itself or in another thread of execution.

lower_detach_clause generates code to call GOMP_new_event, which allocates, initializes and returns a pointer to a gomp_allow_completion_event struct. The return value is then type-cast to a omp_event_handle_t and assigned to the event variable, before the data environment for the task construct is set up.

The event variable is passed into the call to GOMP_task, where it is assigned to a field in the gomp_task struct. If the task is not deferred, then it will wait for the detach event for be fulfilled inside GOMP_task, otherwise it needs to be handled in omp_barrier_handle_tasks.

When a task finishes in omp_barrier_handle_tasks and the detach event has not been fulfilled, it is placed onto a separate queue of unfulfilled tasks before the current thread continues with another task. When the current thread has no more tasks, then it will remove a task from the queue of unfulfilled tasks and wait for it to complete. When it does, it is removed and any dependent tasks are requeued for execution.

We cannot simply block after a task with an unfulfilled event has finished because in the case where there are more tasks than threads, there is the possibility that all the threads will be tied up waiting, while a task that results in an event getting fulfilled never gets run, causing execution to stall.

The memory allocated for the event is released when the associated task is destroyed.

Issues that I can see with the current implementation at the moment are:

- No error checking at the front-end.
- The memory for the event is not mapped on the target. This means that if omp_fulfill_event is called from an 'omp target' section with a target that does not share memory with the host, the event will not be fulfilled (and a segfault will probably occur). - The tasks awaiting event fulfillment currently wait until there are no other runnable tasks left. A better approach would be to poll (without blocking) the waiting tasks whenever any task completes, immediately removing any now-complete tasks and requeuing any dependent tasks.

This patchset has only been very lightly tested on a x86-64 host. Any comments/thoughts/suggestions on this implementation?

Thanks

Kwok
commit 4c3926d9abb1a7e6089a9098e2099e2d574ebfec
Author: Kwok Cheung Yeung <k...@codesourcery.com>
Date:   Tue Nov 3 03:06:26 2020 -0800

    openmp: Add support for the OpenMP 5.0 task detach clause
    
    2020-11-11  Kwok Cheung Yeung  <k...@codesourcery.com>
    
        gcc/
        * builtin-types.def (BT_PTR_SIZED_INT): New primitive type.
        (BT_FN_PSINT_VOID): New function type.
        (BT_FN_VOID_OMPFN_PTR_OMPCPYFN_LONG_LONG_BOOL_UINT_PTR_INT): Rename
        to...
        (BT_FN_VOID_OMPFN_PTR_OMPCPYFN_LONG_LONG_BOOL_UINT_PTR_INT_PSINT):
        ...this.  Add extra argument.
        * gimplify.c (gimplify_scan_omp_clauses): Handle OMP_CLAUSE_DETACH.
        (gimplify_adjust_omp_clauses): Likewise.
        * omp-builtins.def (BUILT_IN_GOMP_TASK): Change function type to
        BT_FN_VOID_OMPFN_PTR_OMPCPYFN_LONG_LONG_BOOL_UINT_PTR_INT_PSINT.
        (BUILT_IN_GOMP_NEW_EVENT): New.
        * omp-expand.c (expand_task_call): Add detach argument when generating
        call to GOMP_task.
        * omp-low.c (scan_sharing_clauses): Setup data environment for detach
        clause.
        (lower_detach_clause): New.
        (lower_omp_taskreg): Call lower_detach_clause for detach clause.  Add
        Gimple statements generated for detach clause.
        * tree-core.h (enum omp_clause_code): Add OMP_CLAUSE_DETACH.
        * tree-pretty-print.c (dump_omp_clause): Handle OMP_CLAUSE_DETACH.
        * tree.c (omp_clause_num_ops): Add entry for OMP_CLAUSE_DETACH.
        (omp_clause_code_name): Add entry for OMP_CLAUSE_DETACH.
        (walk_tree_1): Handle OMP_CLAUSE_DETACH.
        * tree.h (OMP_CLAUSE_DETACH_EXPR): New.
    
        gcc/c-family/
        * c-pragma.h (pragma_omp_clause): Add PRAGMA_OMP_CLAUSE_DETACH.
        Redefine PRAGMA_OACC_CLAUSE_DETACH.
    
        gcc/c/
        * c-parser.c (c_parser_omp_clause_detach): New.
        (c_parser_omp_all_clauses): Handle PRAGMA_OMP_CLAUSE_DETACH clause.
        (OMP_TASK_CLAUSE_MASK): Add mask for PRAGMA_OMP_CLAUSE_DETACH.
        * c-typeck.c (c_finish_omp_clauses): Handle PRAGMA_OMP_CLAUSE_DETACH
        clause.
    
        gcc/cp/
        * parser.c (cp_parser_omp_all_clauses): Handle
        PRAGMA_OMP_CLAUSE_DETACH.
        (OMP_TASK_CLAUSE_MASK): Add mask for PRAGMA_OMP_CLAUSE_DETACH.
        * semantics.c (finish_omp_clauses): Handle OMP_CLAUSE_DETACH clause.
    
        gcc/fortran/
        * dump-parse-tree.c (show_omp_clauses): Handle detach clause.
        * frontend-passes.c (gfc_code_walker): Walk detach expression.
        * gfortran.h (struct gfc_omp_clauses): Add detach field.
        * openmp.c (gfc_free_omp_clauses): Free detach clause.
        (enum omp_mask1): Add OMP_CLAUSE_DETACH.
        (enum omp_mask2): Remove OMP_CLAUSE_DETACH.
        (gfc_match_omp_clauses): Handle OMP_CLAUSE_DETACH for OpenMP.
        (OMP_TASK_CLAUSES): Add OMP_CLAUSE_DETACH.
        * trans-openmp.c (gfc_trans_omp_clauses): Handle detach clause.
        * types.def (BT_PTR_SIZED_INT): New type.
        (BT_FN_PSINT_VOID): New function type.
        (BT_FN_VOID_OMPFN_PTR_OMPCPYFN_LONG_LONG_BOOL_UINT_PTR_INT): Rename
        to...
        (BT_FN_VOID_OMPFN_PTR_OMPCPYFN_LONG_LONG_BOOL_UINT_PTR_INT_PSINT):
        ...this.  Add extra argument.
    
        libgomp/
        * fortran.c (omp_fulfill_event_): New.
        * libgomp.h (struct gomp_allow_completion_event): New.
        (struct gomp_task): Add detach_event field.
        (struct gomp_team): Add task_detach_queue and task_detach_count
        fields.
        (gomp_finish_task): Delete detach_event.
        * libgomp.map (OMP_5.0.1): Add omp_fulfill_event and omp_fulfill_event_.
        (GOMP_5.0): Add GOMP_new_event.
        * libgomp_g.h (GOMP_new_event): New.
        (GOMP_task): Add uintptr_t argument.
        * omp.h.in (enum omp_event_handle_t): New.
        (omp_fulfill_event): New.
        * omp_lib.f90.in (omp_event_handle_kind): New.
        (omp_fulfill_event): New.
        * omp_lib.h.in (omp_event_handle_kind): New.
        (omp_event_handle_kind): New.
        (omp_fulfill_event): Declare.
        * task.c (gomp_init_task): Initialize detach_event field.
        (GOMP_new_event): New.
        (GOMP_task): Add detach argument.  Initialize detach_event field.
        Wait for detach event if task not deferred.
        (gomp_barrier_handle_tasks): Queue tasks with unfulfilled events.
        When idle, wait for events to be fulfilled, remove completed tasks
        and requeue dependent tasks.
        (omp_fulfill_event): New.
        * team.c (gomp_new_team): Initialize task_detach_queue and
        task_detach_count fields.
        (free_team): Free task_detach_queue field.
        * testsuite/libgomp.c-c++-common/task-detach-1.c: New testcase.
        * testsuite/libgomp.c-c++-common/task-detach-2.c: New testcase.
        * testsuite/libgomp.c-c++-common/task-detach-3.c: New testcase.
        * testsuite/libgomp.fortran/task-detach-1.f90: New testcase.
        * testsuite/libgomp.fortran/task-detach-2.f90: New testcase.
        * testsuite/libgomp.fortran/task-detach-3.f90: New testcase.

diff --git a/gcc/builtin-types.def b/gcc/builtin-types.def
index c46b1bc..cb2d5c2 100644
--- a/gcc/builtin-types.def
+++ b/gcc/builtin-types.def
@@ -79,6 +79,7 @@ DEF_PRIMITIVE_TYPE (BT_UINT128, uint128_type_node
 DEF_PRIMITIVE_TYPE (BT_WORD, (*lang_hooks.types.type_for_mode) (word_mode, 1))
 DEF_PRIMITIVE_TYPE (BT_UNWINDWORD, (*lang_hooks.types.type_for_mode)
                                    (targetm.unwind_word_mode (), 1))
+DEF_PRIMITIVE_TYPE (BT_PTR_SIZED_INT, pointer_sized_int_node)
 DEF_PRIMITIVE_TYPE (BT_FLOAT, float_type_node)
 DEF_PRIMITIVE_TYPE (BT_DOUBLE, double_type_node)
 DEF_PRIMITIVE_TYPE (BT_LONGDOUBLE, long_double_type_node)
@@ -253,6 +254,7 @@ DEF_FUNCTION_TYPE_1 (BT_FN_LONG_LONGDOUBLE, BT_LONG, 
BT_LONGDOUBLE)
 DEF_FUNCTION_TYPE_1 (BT_FN_LONGLONG_FLOAT, BT_LONGLONG, BT_FLOAT)
 DEF_FUNCTION_TYPE_1 (BT_FN_LONGLONG_DOUBLE, BT_LONGLONG, BT_DOUBLE)
 DEF_FUNCTION_TYPE_1 (BT_FN_LONGLONG_LONGDOUBLE, BT_LONGLONG, BT_LONGDOUBLE)
+DEF_FUNCTION_TYPE_1 (BT_FN_PSINT_VOID, BT_PTR_SIZED_INT, BT_VOID)
 DEF_FUNCTION_TYPE_1 (BT_FN_VOID_PTR, BT_VOID, BT_PTR)
 DEF_FUNCTION_TYPE_1 (BT_FN_SIZE_CONST_STRING, BT_SIZE, BT_CONST_STRING)
 DEF_FUNCTION_TYPE_1 (BT_FN_INT_CONST_STRING, BT_INT, BT_CONST_STRING)
@@ -754,10 +756,6 @@ DEF_FUNCTION_TYPE_8 
(BT_FN_BOOL_UINT_ULLPTR_LONG_ULL_ULLPTR_ULLPTR_PTR_PTR,
                     BT_BOOL, BT_UINT, BT_PTR_ULONGLONG, BT_LONG, BT_ULONGLONG,
                     BT_PTR_ULONGLONG, BT_PTR_ULONGLONG, BT_PTR, BT_PTR)
 
-DEF_FUNCTION_TYPE_9 (BT_FN_VOID_OMPFN_PTR_OMPCPYFN_LONG_LONG_BOOL_UINT_PTR_INT,
-                    BT_VOID, BT_PTR_FN_VOID_PTR, BT_PTR,
-                    BT_PTR_FN_VOID_PTR_PTR, BT_LONG, BT_LONG,
-                    BT_BOOL, BT_UINT, BT_PTR, BT_INT)
 DEF_FUNCTION_TYPE_9 (BT_FN_VOID_INT_OMPFN_SIZE_PTR_PTR_PTR_UINT_PTR_PTR,
                     BT_VOID, BT_INT, BT_PTR_FN_VOID_PTR, BT_SIZE, BT_PTR,
                     BT_PTR, BT_PTR, BT_UINT, BT_PTR, BT_PTR)
@@ -765,6 +763,10 @@ DEF_FUNCTION_TYPE_9 
(BT_FN_BOOL_LONG_LONG_LONG_LONG_LONG_LONGPTR_LONGPTR_PTR_PTR
                     BT_BOOL, BT_LONG, BT_LONG, BT_LONG, BT_LONG, BT_LONG,
                     BT_PTR_LONG, BT_PTR_LONG, BT_PTR, BT_PTR)
 
+DEF_FUNCTION_TYPE_10 
(BT_FN_VOID_OMPFN_PTR_OMPCPYFN_LONG_LONG_BOOL_UINT_PTR_INT_PSINT,
+                     BT_VOID, BT_PTR_FN_VOID_PTR, BT_PTR,
+                     BT_PTR_FN_VOID_PTR_PTR, BT_LONG, BT_LONG,
+                     BT_BOOL, BT_UINT, BT_PTR, BT_INT, BT_PTR_SIZED_INT)
 DEF_FUNCTION_TYPE_10 
(BT_FN_BOOL_BOOL_ULL_ULL_ULL_LONG_ULL_ULLPTR_ULLPTR_PTR_PTR,
                      BT_BOOL, BT_BOOL, BT_ULONGLONG, BT_ULONGLONG,
                      BT_ULONGLONG, BT_LONG, BT_ULONGLONG, BT_PTR_ULONGLONG,
diff --git a/gcc/c-family/c-pragma.h b/gcc/c-family/c-pragma.h
index 5a493fe..fb784e9 100644
--- a/gcc/c-family/c-pragma.h
+++ b/gcc/c-family/c-pragma.h
@@ -94,6 +94,7 @@ enum pragma_omp_clause {
   PRAGMA_OMP_CLAUSE_DEFAULT,
   PRAGMA_OMP_CLAUSE_DEFAULTMAP,
   PRAGMA_OMP_CLAUSE_DEPEND,
+  PRAGMA_OMP_CLAUSE_DETACH,
   PRAGMA_OMP_CLAUSE_DEVICE,
   PRAGMA_OMP_CLAUSE_DEVICE_TYPE,
   PRAGMA_OMP_CLAUSE_DIST_SCHEDULE,
@@ -150,7 +151,6 @@ enum pragma_omp_clause {
   PRAGMA_OACC_CLAUSE_COPYOUT,
   PRAGMA_OACC_CLAUSE_CREATE,
   PRAGMA_OACC_CLAUSE_DELETE,
-  PRAGMA_OACC_CLAUSE_DETACH,
   PRAGMA_OACC_CLAUSE_DEVICEPTR,
   PRAGMA_OACC_CLAUSE_DEVICE_RESIDENT,
   PRAGMA_OACC_CLAUSE_FINALIZE,
@@ -173,6 +173,7 @@ enum pragma_omp_clause {
   PRAGMA_OACC_CLAUSE_COPYIN = PRAGMA_OMP_CLAUSE_COPYIN,
   PRAGMA_OACC_CLAUSE_DEVICE = PRAGMA_OMP_CLAUSE_DEVICE,
   PRAGMA_OACC_CLAUSE_DEFAULT = PRAGMA_OMP_CLAUSE_DEFAULT,
+  PRAGMA_OACC_CLAUSE_DETACH = PRAGMA_OMP_CLAUSE_DETACH,
   PRAGMA_OACC_CLAUSE_FIRSTPRIVATE = PRAGMA_OMP_CLAUSE_FIRSTPRIVATE,
   PRAGMA_OACC_CLAUSE_IF = PRAGMA_OMP_CLAUSE_IF,
   PRAGMA_OACC_CLAUSE_PRIVATE = PRAGMA_OMP_CLAUSE_PRIVATE,
diff --git a/gcc/c/c-parser.c b/gcc/c/c-parser.c
index ecc3d21..3b766fd 100644
--- a/gcc/c/c-parser.c
+++ b/gcc/c/c-parser.c
@@ -13407,6 +13407,15 @@ c_parser_omp_clause_default (c_parser *parser, tree 
list, bool is_oacc)
   return c;
 }
 
+/* OpenMP 5.0:
+   detach ( event-handle ) */
+
+static tree
+c_parser_omp_clause_detach (c_parser *parser, tree list)
+{
+  return c_parser_omp_var_list_parens (parser, OMP_CLAUSE_DETACH, list);
+}
+
 /* OpenMP 2.5:
    firstprivate ( variable-list ) */
 
@@ -16244,6 +16253,10 @@ c_parser_omp_all_clauses (c_parser *parser, 
omp_clause_mask mask,
          clauses = c_parser_omp_clause_default (parser, clauses, false);
          c_name = "default";
          break;
+       case PRAGMA_OMP_CLAUSE_DETACH:
+         clauses = c_parser_omp_clause_detach (parser, clauses);
+         c_name = "detach";
+         break;
        case PRAGMA_OMP_CLAUSE_FIRSTPRIVATE:
          clauses = c_parser_omp_clause_firstprivate (parser, clauses);
          c_name = "firstprivate";
@@ -19142,7 +19155,8 @@ c_parser_omp_single (location_t loc, c_parser *parser, 
bool *if_p)
        | (OMP_CLAUSE_MASK_1 << PRAGMA_OMP_CLAUSE_DEPEND)       \
        | (OMP_CLAUSE_MASK_1 << PRAGMA_OMP_CLAUSE_PRIORITY)     \
        | (OMP_CLAUSE_MASK_1 << PRAGMA_OMP_CLAUSE_ALLOCATE)     \
-       | (OMP_CLAUSE_MASK_1 << PRAGMA_OMP_CLAUSE_IN_REDUCTION))
+       | (OMP_CLAUSE_MASK_1 << PRAGMA_OMP_CLAUSE_IN_REDUCTION) \
+       | (OMP_CLAUSE_MASK_1 << PRAGMA_OMP_CLAUSE_DETACH))
 
 static tree
 c_parser_omp_task (location_t loc, c_parser *parser, bool *if_p)
diff --git a/gcc/c/c-typeck.c b/gcc/c/c-typeck.c
index 9684037..9a5018c 100644
--- a/gcc/c/c-typeck.c
+++ b/gcc/c/c-typeck.c
@@ -14937,6 +14937,11 @@ c_finish_omp_clauses (tree clauses, enum 
c_omp_region_type ort)
          pc = &OMP_CLAUSE_CHAIN (c);
          continue;
 
+       case OMP_CLAUSE_DETACH:
+         t = OMP_CLAUSE_DECL (c);
+         pc = &OMP_CLAUSE_CHAIN (c);
+         continue;
+
        case OMP_CLAUSE_IF:
        case OMP_CLAUSE_NUM_THREADS:
        case OMP_CLAUSE_NUM_TEAMS:
diff --git a/gcc/cp/parser.c b/gcc/cp/parser.c
index bbf157e..92457929 100644
--- a/gcc/cp/parser.c
+++ b/gcc/cp/parser.c
@@ -37847,6 +37847,10 @@ cp_parser_omp_all_clauses (cp_parser *parser, 
omp_clause_mask mask,
                                                 token->location);
          c_name = "depend";
          break;
+       case PRAGMA_OMP_CLAUSE_DETACH:
+         clauses = cp_parser_omp_var_list (parser, OMP_CLAUSE_DETACH, clauses);
+         c_name = "detach";
+         break;
        case PRAGMA_OMP_CLAUSE_MAP:
          clauses = cp_parser_omp_clause_map (parser, clauses);
          c_name = "map";
@@ -40381,7 +40385,8 @@ cp_parser_omp_single (cp_parser *parser, cp_token 
*pragma_tok, bool *if_p)
        | (OMP_CLAUSE_MASK_1 << PRAGMA_OMP_CLAUSE_DEPEND)       \
        | (OMP_CLAUSE_MASK_1 << PRAGMA_OMP_CLAUSE_PRIORITY)     \
        | (OMP_CLAUSE_MASK_1 << PRAGMA_OMP_CLAUSE_ALLOCATE)     \
-       | (OMP_CLAUSE_MASK_1 << PRAGMA_OMP_CLAUSE_IN_REDUCTION))
+       | (OMP_CLAUSE_MASK_1 << PRAGMA_OMP_CLAUSE_IN_REDUCTION) \
+       | (OMP_CLAUSE_MASK_1 << PRAGMA_OMP_CLAUSE_DETACH))
 
 static tree
 cp_parser_omp_task (cp_parser *parser, cp_token *pragma_tok, bool *if_p)
diff --git a/gcc/cp/semantics.c b/gcc/cp/semantics.c
index a550db6..6972d22 100644
--- a/gcc/cp/semantics.c
+++ b/gcc/cp/semantics.c
@@ -7391,6 +7391,9 @@ finish_omp_clauses (tree clauses, enum c_omp_region_type 
ort)
                }
            }
          break;
+       case OMP_CLAUSE_DETACH:
+         t = OMP_CLAUSE_DECL (c);
+         break;
 
        case OMP_CLAUSE_MAP:
        case OMP_CLAUSE_TO:
diff --git a/gcc/fortran/dump-parse-tree.c b/gcc/fortran/dump-parse-tree.c
index 43b97ba..b28fe73 100644
--- a/gcc/fortran/dump-parse-tree.c
+++ b/gcc/fortran/dump-parse-tree.c
@@ -1692,6 +1692,12 @@ show_omp_clauses (gfc_omp_clauses *omp_clauses)
       show_expr (omp_clauses->priority);
       fputc (')', dumpfile);
     }
+  if (omp_clauses->detach)
+    {
+      fputs (" DETACH(", dumpfile);
+      show_expr (omp_clauses->detach);
+      fputc (')', dumpfile);
+    }
   for (i = 0; i < OMP_IF_LAST; i++)
     if (omp_clauses->if_exprs[i])
       {
diff --git a/gcc/fortran/frontend-passes.c b/gcc/fortran/frontend-passes.c
index 83f6fd8..699b354 100644
--- a/gcc/fortran/frontend-passes.c
+++ b/gcc/fortran/frontend-passes.c
@@ -5597,6 +5597,7 @@ gfc_code_walker (gfc_code **c, walk_code_fn_t codefn, 
walk_expr_fn_t exprfn,
                  WALK_SUBEXPR (co->ext.omp_clauses->hint);
                  WALK_SUBEXPR (co->ext.omp_clauses->num_tasks);
                  WALK_SUBEXPR (co->ext.omp_clauses->priority);
+                 WALK_SUBEXPR (co->ext.omp_clauses->detach);
                  for (idx = 0; idx < OMP_IF_LAST; idx++)
                    WALK_SUBEXPR (co->ext.omp_clauses->if_exprs[idx]);
                  for (idx = 0;
diff --git a/gcc/fortran/gfortran.h b/gcc/fortran/gfortran.h
index dfd7796..a2193dc 100644
--- a/gcc/fortran/gfortran.h
+++ b/gcc/fortran/gfortran.h
@@ -1410,6 +1410,7 @@ typedef struct gfc_omp_clauses
   struct gfc_expr *hint;
   struct gfc_expr *num_tasks;
   struct gfc_expr *priority;
+  struct gfc_expr *detach;
   struct gfc_expr *if_exprs[OMP_IF_LAST];
   enum gfc_omp_sched_kind dist_sched_kind;
   struct gfc_expr *dist_chunk_size;
diff --git a/gcc/fortran/openmp.c b/gcc/fortran/openmp.c
index 2270c85..c361859 100644
--- a/gcc/fortran/openmp.c
+++ b/gcc/fortran/openmp.c
@@ -91,6 +91,7 @@ gfc_free_omp_clauses (gfc_omp_clauses *c)
   gfc_free_expr (c->hint);
   gfc_free_expr (c->num_tasks);
   gfc_free_expr (c->priority);
+  gfc_free_expr (c->detach);
   for (i = 0; i < OMP_IF_LAST; i++)
     gfc_free_expr (c->if_exprs[i]);
   gfc_free_expr (c->async_expr);
@@ -805,6 +806,7 @@ enum omp_mask1
   OMP_CLAUSE_ATOMIC,  /* OpenMP 5.0.  */
   OMP_CLAUSE_CAPTURE,  /* OpenMP 5.0.  */
   OMP_CLAUSE_MEMORDER,  /* OpenMP 5.0.  */
+  OMP_CLAUSE_DETACH,  /* OpenMP 5.0.  */
   OMP_CLAUSE_NOWAIT,
   /* This must come last.  */
   OMP_MASK1_LAST
@@ -838,7 +840,6 @@ enum omp_mask2
   OMP_CLAUSE_IF_PRESENT,
   OMP_CLAUSE_FINALIZE,
   OMP_CLAUSE_ATTACH,
-  OMP_CLAUSE_DETACH,
   /* This must come last.  */
   OMP_MASK2_LAST
 };
@@ -1219,6 +1220,12 @@ gfc_match_omp_clauses (gfc_omp_clauses **cp, const 
omp_mask mask,
                gfc_current_locus = old_loc;
            }
          if ((mask & OMP_CLAUSE_DETACH)
+             && !openacc
+             && c->detach == NULL
+             && gfc_match ("detach ( %e )", &c->detach) == MATCH_YES)
+           continue;
+         if ((mask & OMP_CLAUSE_DETACH)
+             && openacc
              && gfc_match ("detach ( ") == MATCH_YES
              && gfc_match_omp_map_clause (&c->lists[OMP_LIST_MAP],
                                           OMP_MAP_DETACH, false,
@@ -2696,7 +2703,7 @@ cleanup:
   (omp_mask (OMP_CLAUSE_PRIVATE) | OMP_CLAUSE_FIRSTPRIVATE             \
    | OMP_CLAUSE_SHARED | OMP_CLAUSE_IF | OMP_CLAUSE_DEFAULT            \
    | OMP_CLAUSE_UNTIED | OMP_CLAUSE_FINAL | OMP_CLAUSE_MERGEABLE       \
-   | OMP_CLAUSE_DEPEND | OMP_CLAUSE_PRIORITY)
+   | OMP_CLAUSE_DEPEND | OMP_CLAUSE_PRIORITY | OMP_CLAUSE_DETACH)
 #define OMP_TASKLOOP_CLAUSES \
   (omp_mask (OMP_CLAUSE_PRIVATE) | OMP_CLAUSE_FIRSTPRIVATE             \
    | OMP_CLAUSE_LASTPRIVATE | OMP_CLAUSE_SHARED | OMP_CLAUSE_IF                
\
diff --git a/gcc/fortran/trans-openmp.c b/gcc/fortran/trans-openmp.c
index 1d652a0..9182482 100644
--- a/gcc/fortran/trans-openmp.c
+++ b/gcc/fortran/trans-openmp.c
@@ -3639,6 +3639,21 @@ gfc_trans_omp_clauses (stmtblock_t *block, 
gfc_omp_clauses *clauses,
       omp_clauses = gfc_trans_add_clause (c, omp_clauses);
     }
 
+  if (clauses->detach)
+    {
+      tree detach;
+
+      gfc_init_se (&se, NULL);
+      gfc_conv_expr (&se, clauses->detach);
+      gfc_add_block_to_block (block, &se.pre);
+      detach = se.expr;
+      gfc_add_block_to_block (block, &se.post);
+
+      c = build_omp_clause (gfc_get_location (&where), OMP_CLAUSE_DETACH);
+      OMP_CLAUSE_DETACH_EXPR (c) = detach;
+      omp_clauses = gfc_trans_add_clause (c, omp_clauses);
+    }
+
   if (clauses->hint)
     {
       tree hint;
diff --git a/gcc/fortran/types.def b/gcc/fortran/types.def
index 7b4925c..36543d2 100644
--- a/gcc/fortran/types.def
+++ b/gcc/fortran/types.def
@@ -53,6 +53,7 @@ DEF_PRIMITIVE_TYPE (BT_LONG, long_integer_type_node)
 DEF_PRIMITIVE_TYPE (BT_ULONGLONG, long_long_unsigned_type_node)
 DEF_PRIMITIVE_TYPE (BT_WORD, (*lang_hooks.types.type_for_mode) (word_mode, 1))
 DEF_PRIMITIVE_TYPE (BT_SIZE, size_type_node)
+DEF_PRIMITIVE_TYPE (BT_PTR_SIZED_INT, pointer_sized_int_node)
 
 DEF_PRIMITIVE_TYPE (BT_I1, builtin_type_for_size (BITS_PER_UNIT*1, 1))
 DEF_PRIMITIVE_TYPE (BT_I2, builtin_type_for_size (BITS_PER_UNIT*2, 1))
@@ -85,6 +86,7 @@ DEF_FUNCTION_TYPE_1 (BT_FN_VOID_VPTR, BT_VOID, 
BT_VOLATILE_PTR)
 DEF_FUNCTION_TYPE_1 (BT_FN_INT_INT, BT_INT, BT_INT)
 DEF_FUNCTION_TYPE_1 (BT_FN_UINT_UINT, BT_UINT, BT_UINT)
 DEF_FUNCTION_TYPE_1 (BT_FN_PTR_PTR, BT_PTR, BT_PTR)
+DEF_FUNCTION_TYPE_1 (BT_FN_PSINT_VOID, BT_PTR_SIZED_INT, BT_VOID)
 DEF_FUNCTION_TYPE_1 (BT_FN_VOID_INT, BT_VOID, BT_INT)
 DEF_FUNCTION_TYPE_1 (BT_FN_VOID_BOOL, BT_VOID, BT_BOOL)
 DEF_FUNCTION_TYPE_1 (BT_FN_BOOL_INT, BT_BOOL, BT_INT)
@@ -230,10 +232,6 @@ DEF_FUNCTION_TYPE_8 
(BT_FN_BOOL_UINT_ULLPTR_LONG_ULL_ULLPTR_ULLPTR_PTR_PTR,
                     BT_BOOL, BT_UINT, BT_PTR_ULONGLONG, BT_LONG, BT_ULONGLONG,
                     BT_PTR_ULONGLONG, BT_PTR_ULONGLONG, BT_PTR, BT_PTR)
 
-DEF_FUNCTION_TYPE_9 (BT_FN_VOID_OMPFN_PTR_OMPCPYFN_LONG_LONG_BOOL_UINT_PTR_INT,
-                    BT_VOID, BT_PTR_FN_VOID_PTR, BT_PTR,
-                    BT_PTR_FN_VOID_PTR_PTR, BT_LONG, BT_LONG,
-                    BT_BOOL, BT_UINT, BT_PTR, BT_INT)
 DEF_FUNCTION_TYPE_9 (BT_FN_VOID_INT_OMPFN_SIZE_PTR_PTR_PTR_UINT_PTR_PTR,
                     BT_VOID, BT_INT, BT_PTR_FN_VOID_PTR, BT_SIZE, BT_PTR,
                     BT_PTR, BT_PTR, BT_UINT, BT_PTR, BT_PTR)
@@ -241,6 +239,10 @@ DEF_FUNCTION_TYPE_9 
(BT_FN_BOOL_LONG_LONG_LONG_LONG_LONG_LONGPTR_LONGPTR_PTR_PTR
                     BT_BOOL, BT_LONG, BT_LONG, BT_LONG, BT_LONG, BT_LONG,
                     BT_PTR_LONG, BT_PTR_LONG, BT_PTR, BT_PTR)
 
+DEF_FUNCTION_TYPE_10 
(BT_FN_VOID_OMPFN_PTR_OMPCPYFN_LONG_LONG_BOOL_UINT_PTR_INT_PSINT,
+                     BT_VOID, BT_PTR_FN_VOID_PTR, BT_PTR,
+                     BT_PTR_FN_VOID_PTR_PTR, BT_LONG, BT_LONG,
+                     BT_BOOL, BT_UINT, BT_PTR, BT_INT, BT_PTR_SIZED_INT)
 DEF_FUNCTION_TYPE_10 
(BT_FN_BOOL_BOOL_ULL_ULL_ULL_LONG_ULL_ULLPTR_ULLPTR_PTR_PTR,
                      BT_BOOL, BT_BOOL, BT_ULONGLONG, BT_ULONGLONG,
                      BT_ULONGLONG, BT_LONG, BT_ULONGLONG, BT_PTR_ULONGLONG,
diff --git a/gcc/gimplify.c b/gcc/gimplify.c
index aa3b914..dc41805 100644
--- a/gcc/gimplify.c
+++ b/gcc/gimplify.c
@@ -9523,6 +9523,10 @@ gimplify_scan_omp_clauses (tree *list_p, gimple_seq 
*pre_p,
            }
          break;
 
+       case OMP_CLAUSE_DETACH:
+         decl = OMP_CLAUSE_DECL (c);
+         goto do_notice;
+
        case OMP_CLAUSE_IF:
          if (OMP_CLAUSE_IF_MODIFIER (c) != ERROR_MARK
              && OMP_CLAUSE_IF_MODIFIER (c) != code)
@@ -10621,6 +10625,7 @@ gimplify_adjust_omp_clauses (gimple_seq *pre_p, 
gimple_seq body, tree *list_p,
        case OMP_CLAUSE_DEFAULTMAP:
        case OMP_CLAUSE_ORDER:
        case OMP_CLAUSE_BIND:
+       case OMP_CLAUSE_DETACH:
        case OMP_CLAUSE_USE_DEVICE_PTR:
        case OMP_CLAUSE_USE_DEVICE_ADDR:
        case OMP_CLAUSE_IS_DEVICE_PTR:
diff --git a/gcc/omp-builtins.def b/gcc/omp-builtins.def
index f461d60..c883ec8 100644
--- a/gcc/omp-builtins.def
+++ b/gcc/omp-builtins.def
@@ -379,7 +379,7 @@ DEF_GOMP_BUILTIN (BUILT_IN_GOMP_PARALLEL_REDUCTIONS,
                  "GOMP_parallel_reductions",
                  BT_FN_UINT_OMPFN_PTR_UINT_UINT, ATTR_NOTHROW_LIST)
 DEF_GOMP_BUILTIN (BUILT_IN_GOMP_TASK, "GOMP_task",
-                 BT_FN_VOID_OMPFN_PTR_OMPCPYFN_LONG_LONG_BOOL_UINT_PTR_INT,
+                 
BT_FN_VOID_OMPFN_PTR_OMPCPYFN_LONG_LONG_BOOL_UINT_PTR_INT_PSINT,
                  ATTR_NOTHROW_LIST)
 DEF_GOMP_BUILTIN (BUILT_IN_GOMP_TASKLOOP, "GOMP_taskloop",
                  
BT_FN_VOID_OMPFN_PTR_OMPCPYFN_LONG_LONG_UINT_LONG_INT_LONG_LONG_LONG,
@@ -446,3 +446,5 @@ DEF_GOMP_BUILTIN 
(BUILT_IN_GOMP_WORKSHARE_TASK_REDUCTION_UNREGISTER,
                  BT_FN_VOID_BOOL, ATTR_NOTHROW_LEAF_LIST)
 DEF_GOACC_BUILTIN (BUILT_IN_GOACC_DECLARE, "GOACC_declare",
                   BT_FN_VOID_INT_SIZE_PTR_PTR_PTR, ATTR_NOTHROW_LIST)
+DEF_GOMP_BUILTIN (BUILT_IN_GOMP_NEW_EVENT, "GOMP_new_event",
+                 BT_FN_PSINT_VOID, ATTR_NOTHROW_LEAF_LIST)
diff --git a/gcc/omp-expand.c b/gcc/omp-expand.c
index 6583c88..9f979ac 100644
--- a/gcc/omp-expand.c
+++ b/gcc/omp-expand.c
@@ -762,6 +762,7 @@ expand_task_call (struct omp_region *region, basic_block bb,
   tree depend = omp_find_clause (clauses, OMP_CLAUSE_DEPEND);
   tree finalc = omp_find_clause (clauses, OMP_CLAUSE_FINAL);
   tree priority = omp_find_clause (clauses, OMP_CLAUSE_PRIORITY);
+  tree detach = omp_find_clause (clauses, OMP_CLAUSE_DETACH);
 
   unsigned int iflags
     = (untied ? GOMP_TASK_FLAG_UNTIED : 0)
@@ -853,6 +854,11 @@ expand_task_call (struct omp_region *region, basic_block 
bb,
     priority = integer_zero_node;
 
   gsi = gsi_last_nondebug_bb (bb);
+
+  detach = detach
+      ? fold_convert (pointer_sized_int_node, OMP_CLAUSE_DETACH_EXPR (detach))
+      : null_pointer_node;
+
   tree t = gimple_omp_task_data_arg (entry_stmt);
   if (t == NULL)
     t2 = null_pointer_node;
@@ -875,10 +881,10 @@ expand_task_call (struct omp_region *region, basic_block 
bb,
                         num_tasks, priority, startvar, endvar, step);
   else
     t = build_call_expr (builtin_decl_explicit (BUILT_IN_GOMP_TASK),
-                        9, t1, t2, t3,
+                        10, t1, t2, t3,
                         gimple_omp_task_arg_size (entry_stmt),
                         gimple_omp_task_arg_align (entry_stmt), cond, flags,
-                        depend, priority);
+                        depend, priority, detach);
 
   force_gimple_operand_gsi (&gsi, t, true, NULL_TREE,
                            false, GSI_CONTINUE_LINKING);
diff --git a/gcc/omp-low.c b/gcc/omp-low.c
index ea9008b..c5221f6 100644
--- a/gcc/omp-low.c
+++ b/gcc/omp-low.c
@@ -1313,6 +1313,14 @@ scan_sharing_clauses (tree clauses, omp_context *ctx)
          install_var_field (decl, by_ref, 3, ctx);
          break;
 
+       case OMP_CLAUSE_DETACH:
+         decl = OMP_CLAUSE_DECL (c);
+         install_var_field (decl, true, 3, ctx);
+         install_var_local (decl, ctx);
+         if (ctx->outer)
+           scan_omp_op (&OMP_CLAUSE_OPERAND (c, 0), ctx->outer);
+         break;
+
        case OMP_CLAUSE_FINAL:
        case OMP_CLAUSE_IF:
        case OMP_CLAUSE_NUM_THREADS:
@@ -1654,6 +1662,7 @@ scan_sharing_clauses (tree clauses, omp_context *ctx)
        case OMP_CLAUSE_SIMDLEN:
        case OMP_CLAUSE_ALIGNED:
        case OMP_CLAUSE_DEPEND:
+       case OMP_CLAUSE_DETACH:
        case OMP_CLAUSE_ALLOCATE:
        case OMP_CLAUSE__LOOPTEMP_:
        case OMP_CLAUSE__REDUCTEMP_:
@@ -11094,6 +11103,26 @@ create_task_copyfn (gomp_task *task_stmt, omp_context 
*ctx)
 }
 
 static void
+lower_detach_clause (tree *pclauses, gimple_seq *iseq, omp_context *ctx)
+{
+  tree clause = omp_find_clause (*pclauses, OMP_CLAUSE_DETACH);
+  gcc_assert (clause);
+
+  tree event_decl = OMP_CLAUSE_DECL (clause);
+  tree event_ref = lookup_decl_in_outer_ctx (event_decl, ctx);
+  tree fn_decl = builtin_decl_explicit (BUILT_IN_GOMP_NEW_EVENT);
+  tree handle = create_tmp_var (pointer_sized_int_node);
+
+  gimple *call_stmt = gimple_build_call (fn_decl, 0);
+  gimple_call_set_lhs (call_stmt, handle);
+  gimple_seq_add_stmt (iseq, call_stmt);
+
+  gimplify_assign (event_ref,
+                  fold_convert (TREE_TYPE (event_decl), handle),
+                  iseq);
+}
+
+static void
 lower_depend_clauses (tree *pclauses, gimple_seq *iseq, gimple_seq *oseq)
 {
   tree c, clauses;
@@ -11242,6 +11271,15 @@ lower_omp_taskreg (gimple_stmt_iterator *gsi_p, 
omp_context *ctx)
       if (ws_num == 1)
        gimple_omp_parallel_set_combined_p (stmt, true);
     }
+
+  gimple_seq detach_ilist = NULL;
+  if (gimple_code (stmt) == GIMPLE_OMP_TASK
+      && omp_find_clause (clauses, OMP_CLAUSE_DETACH))
+    {
+      lower_detach_clause (gimple_omp_task_clauses_ptr (stmt), &detach_ilist,
+                          ctx);
+    }
+
   gimple_seq dep_ilist = NULL;
   gimple_seq dep_olist = NULL;
   if (gimple_code (stmt) == GIMPLE_OMP_TASK
@@ -11319,6 +11357,10 @@ lower_omp_taskreg (gimple_stmt_iterator *gsi_p, 
omp_context *ctx)
 
   gimple_seq olist = NULL;
   gimple_seq ilist = NULL;
+
+  if (detach_ilist)
+    gimple_seq_add_seq (&ilist, detach_ilist);
+
   lower_send_clauses (clauses, &ilist, &olist, ctx);
   lower_send_shared_vars (&ilist, &olist, ctx);
 
diff --git a/gcc/tree-core.h b/gcc/tree-core.h
index c9280a8..54a436b 100644
--- a/gcc/tree-core.h
+++ b/gcc/tree-core.h
@@ -301,6 +301,9 @@ enum omp_clause_code {
   /* OpenMP clause: to (variable-list).  */
   OMP_CLAUSE_TO,
 
+  /* OpenMP clause: detach (event-handle).  */
+  OMP_CLAUSE_DETACH,
+
   /* OpenACC clauses: {copy, copyin, copyout, create, delete, deviceptr,
      device, host (self), present, present_or_copy (pcopy), present_or_copyin
      (pcopyin), present_or_copyout (pcopyout), present_or_create (pcreate)}
diff --git a/gcc/tree-pretty-print.c b/gcc/tree-pretty-print.c
index 318f048..f0fef6a 100644
--- a/gcc/tree-pretty-print.c
+++ b/gcc/tree-pretty-print.c
@@ -1265,6 +1265,12 @@ dump_omp_clause (pretty_printer *pp, tree clause, int 
spc, dump_flags_t flags)
     case OMP_CLAUSE_FINALIZE:
       pp_string (pp, "finalize");
       break;
+    case OMP_CLAUSE_DETACH:
+      pp_string (pp, "detach(");
+      dump_generic_node (pp, OMP_CLAUSE_DETACH_EXPR (clause), spc, flags,
+                        false);
+      pp_right_paren (pp);
+      break;
 
     default:
       gcc_unreachable ();
diff --git a/gcc/tree.c b/gcc/tree.c
index 9260772..1c8baed 100644
--- a/gcc/tree.c
+++ b/gcc/tree.c
@@ -299,6 +299,7 @@ unsigned const char omp_clause_num_ops[] =
   1, /* OMP_CLAUSE_LINK  */
   2, /* OMP_CLAUSE_FROM  */
   2, /* OMP_CLAUSE_TO  */
+  1, /* OMP_CLAUSE_DETACH  */
   2, /* OMP_CLAUSE_MAP  */
   1, /* OMP_CLAUSE_USE_DEVICE_PTR  */
   1, /* OMP_CLAUSE_USE_DEVICE_ADDR  */
@@ -384,6 +385,7 @@ const char * const omp_clause_code_name[] =
   "link",
   "from",
   "to",
+  "detach",
   "map",
   "use_device_ptr",
   "use_device_addr",
@@ -12176,6 +12178,7 @@ walk_tree_1 (tree *tp, walk_tree_fn func, void *data,
        case OMP_CLAUSE_HINT:
        case OMP_CLAUSE_TO_DECLARE:
        case OMP_CLAUSE_LINK:
+       case OMP_CLAUSE_DETACH:
        case OMP_CLAUSE_USE_DEVICE_PTR:
        case OMP_CLAUSE_USE_DEVICE_ADDR:
        case OMP_CLAUSE_IS_DEVICE_PTR:
diff --git a/gcc/tree.h b/gcc/tree.h
index f8f0a60..5bf1c1d 100644
--- a/gcc/tree.h
+++ b/gcc/tree.h
@@ -1585,6 +1585,9 @@ class auto_suppress_location_wrappers
 #define OMP_CLAUSE_PRIORITY_EXPR(NODE) \
   OMP_CLAUSE_OPERAND (OMP_CLAUSE_SUBCODE_CHECK (NODE, OMP_CLAUSE_PRIORITY),0)
 
+#define OMP_CLAUSE_DETACH_EXPR(NODE) \
+  OMP_CLAUSE_OPERAND (OMP_CLAUSE_SUBCODE_CHECK (NODE, OMP_CLAUSE_DETACH),0)
+
 /* OpenACC clause expressions  */
 #define OMP_CLAUSE_EXPR(NODE, CLAUSE) \
   OMP_CLAUSE_OPERAND (OMP_CLAUSE_SUBCODE_CHECK (NODE, CLAUSE), 0)
diff --git a/libgomp/fortran.c b/libgomp/fortran.c
index cd719f9..976b248 100644
--- a/libgomp/fortran.c
+++ b/libgomp/fortran.c
@@ -605,6 +605,12 @@ omp_get_max_task_priority_ (void)
 }
 
 void
+omp_fulfill_event_ (intptr_t event)
+{
+  omp_fulfill_event ((omp_event_handle_t) event);
+}
+
+void
 omp_set_affinity_format_ (const char *format, size_t format_len)
 {
   gomp_set_affinity_format (format, format_len);
diff --git a/libgomp/libgomp.h b/libgomp/libgomp.h
index da7ac03..d9d2514 100644
--- a/libgomp/libgomp.h
+++ b/libgomp/libgomp.h
@@ -517,6 +517,12 @@ struct gomp_taskwait
   gomp_sem_t taskwait_sem;
 };
 
+struct gomp_allow_completion_event
+{
+  bool fulfilled;
+  gomp_sem_t completion_sem;
+};
+
 /* This structure describes a "task" to be run by a thread.  */
 
 struct gomp_task
@@ -546,6 +552,8 @@ struct gomp_task
      entries and the gomp_task in which they reside.  */
   struct priority_node pnode[3];
 
+  struct gomp_allow_completion_event *detach_event;
+
   struct gomp_task_icv icv;
   void (*fn) (void *);
   void *fn_data;
@@ -686,6 +694,10 @@ struct gomp_team
   int work_share_cancelled;
   int team_cancelled;
 
+  /* Tasks waiting for their completion event to be fulfilled.  */
+  struct priority_queue task_detach_queue;
+  unsigned int task_detach_count;
+
   /* This array contains structures for implicit tasks.  */
   struct gomp_task implicit_task[];
 };
@@ -932,6 +944,8 @@ gomp_finish_task (struct gomp_task *task)
 {
   if (__builtin_expect (task->depend_hash != NULL, 0))
     free (task->depend_hash);
+  if (task->detach_event)
+    free (task->detach_event);
 }
 
 /* team.c */
diff --git a/libgomp/libgomp.map b/libgomp/libgomp.map
index c5f52f7..d29e05b 100644
--- a/libgomp/libgomp.map
+++ b/libgomp/libgomp.map
@@ -195,6 +195,8 @@ OMP_5.0.1 {
        omp_free;
        omp_get_supported_active_levels;
        omp_get_supported_active_levels_;
+       omp_fulfill_event;
+       omp_fulfill_event_;
 } OMP_5.0;
 
 GOMP_1.0 {
@@ -347,6 +349,7 @@ GOMP_5.0 {
        GOMP_loop_ull_nonmonotonic_runtime_start;
        GOMP_loop_ull_ordered_start;
        GOMP_loop_ull_start;
+       GOMP_new_event;
        GOMP_parallel_loop_maybe_nonmonotonic_runtime;
        GOMP_parallel_loop_nonmonotonic_runtime;
        GOMP_parallel_reductions;
diff --git a/libgomp/libgomp_g.h b/libgomp/libgomp_g.h
index 59e3697..670cb2d 100644
--- a/libgomp/libgomp_g.h
+++ b/libgomp/libgomp_g.h
@@ -293,8 +293,9 @@ extern bool GOMP_cancellation_point (int);
 
 /* task.c */
 
+extern uintptr_t GOMP_new_event (void);
 extern void GOMP_task (void (*) (void *), void *, void (*) (void *, void *),
-                      long, long, bool, unsigned, void **, int);
+                      long, long, bool, unsigned, void **, int, uintptr_t);
 extern void GOMP_taskloop (void (*) (void *), void *,
                           void (*) (void *, void *), long, long, unsigned,
                           unsigned long, int, long, long, long);
diff --git a/libgomp/omp.h.in b/libgomp/omp.h.in
index be7df6d..b7c3eea 100644
--- a/libgomp/omp.h.in
+++ b/libgomp/omp.h.in
@@ -171,6 +171,11 @@ typedef struct omp_alloctrait_t
   omp_uintptr_t value;
 } omp_alloctrait_t;
 
+typedef enum omp_event_handle_t __GOMP_UINTPTR_T_ENUM
+{
+  __omp_event_handle_t_max__ = __UINTPTR_MAX__
+} omp_event_handle_t;
+
 #ifdef __cplusplus
 extern "C" {
 # define __GOMP_NOTHROW throw ()
@@ -245,6 +250,8 @@ extern int omp_is_initial_device (void) __GOMP_NOTHROW;
 extern int omp_get_initial_device (void) __GOMP_NOTHROW;
 extern int omp_get_max_task_priority (void) __GOMP_NOTHROW;
 
+extern void omp_fulfill_event (omp_event_handle_t) __GOMP_NOTHROW;
+
 extern void *omp_target_alloc (__SIZE_TYPE__, int) __GOMP_NOTHROW;
 extern void omp_target_free (void *, int) __GOMP_NOTHROW;
 extern int omp_target_is_present (const void *, int) __GOMP_NOTHROW;
diff --git a/libgomp/omp_lib.f90.in b/libgomp/omp_lib.f90.in
index 3b7f0cb..7b70d8b 100644
--- a/libgomp/omp_lib.f90.in
+++ b/libgomp/omp_lib.f90.in
@@ -39,6 +39,7 @@
         integer, parameter :: omp_alloctrait_val_kind = c_intptr_t
         integer, parameter :: omp_memspace_handle_kind = c_intptr_t
         integer, parameter :: omp_depend_kind = @OMP_DEPEND_KIND@
+        integer, parameter :: omp_event_handle_kind = c_intptr_t
         integer (omp_sched_kind), parameter :: omp_sched_static = 1
         integer (omp_sched_kind), parameter :: omp_sched_dynamic = 2
         integer (omp_sched_kind), parameter :: omp_sched_guided = 3
@@ -556,6 +557,13 @@
         end interface
 
         interface
+          subroutine omp_fulfill_event (event)
+            use omp_lib_kinds
+            integer (kind=omp_event_handle_kind), value, intent(in) :: event
+          end subroutine omp_fulfill_event
+        end interface
+
+        interface
           subroutine omp_set_affinity_format (format)
             character(len=*), intent(in) :: format
           end subroutine omp_set_affinity_format
diff --git a/libgomp/omp_lib.h.in b/libgomp/omp_lib.h.in
index eb1dcc4..5b4053f 100644
--- a/libgomp/omp_lib.h.in
+++ b/libgomp/omp_lib.h.in
@@ -82,10 +82,12 @@
 
       integer omp_allocator_handle_kind, omp_alloctrait_key_kind
       integer omp_alloctrait_val_kind, omp_memspace_handle_kind
+      integer omp_event_handle_kind
       parameter (omp_allocator_handle_kind = @INTPTR_T_KIND@)
       parameter (omp_alloctrait_key_kind = 4)
       parameter (omp_alloctrait_val_kind = @INTPTR_T_KIND@)
       parameter (omp_memspace_handle_kind = @INTPTR_T_KIND@)
+      parameter (omp_event_handle_kind = @INTPTR_T_KIND@)
       integer (omp_alloctrait_key_kind) omp_atk_sync_hint
       integer (omp_alloctrait_key_kind) omp_atk_alignment
       integer (omp_alloctrait_key_kind) omp_atk_access
@@ -245,6 +247,8 @@
       external omp_get_max_task_priority
       integer(4) omp_get_max_task_priority
 
+      external omp_fulfill_event
+
       external omp_set_affinity_format, omp_get_affinity_format
       external omp_display_affinity, omp_capture_affinity
       integer(4) omp_get_affinity_format
diff --git a/libgomp/task.c b/libgomp/task.c
index a95067c..a09a133 100644
--- a/libgomp/task.c
+++ b/libgomp/task.c
@@ -86,6 +86,7 @@ gomp_init_task (struct gomp_task *task, struct gomp_task 
*parent_task,
   task->dependers = NULL;
   task->depend_hash = NULL;
   task->depend_count = 0;
+  task->detach_event = NULL;
 }
 
 /* Clean up a task, after completing it.  */
@@ -326,6 +327,21 @@ gomp_task_handle_depend (struct gomp_task *task, struct 
gomp_task *parent,
     }
 }
 
+uintptr_t
+GOMP_new_event ()
+{
+  struct gomp_allow_completion_event *event;
+
+  event = (struct gomp_allow_completion_event *)
+           gomp_malloc (sizeof (struct gomp_allow_completion_event));
+  event->fulfilled = false;
+  gomp_sem_init (&event->completion_sem, 0);
+
+  gomp_debug (0, "GOMP_new_event: %p\n", event);
+
+  return (uintptr_t) event;
+}
+
 /* Called when encountering an explicit task directive.  If IF_CLAUSE is
    false, then we must not delay in executing the task.  If UNTIED is true,
    then the task may be executed by any member of the team.
@@ -347,11 +363,14 @@ gomp_task_handle_depend (struct gomp_task *task, struct 
gomp_task *parent,
 void
 GOMP_task (void (*fn) (void *), void *data, void (*cpyfn) (void *, void *),
           long arg_size, long arg_align, bool if_clause, unsigned flags,
-          void **depend, int priority)
+          void **depend, int priority, uintptr_t detach)
 {
   struct gomp_thread *thr = gomp_thread ();
   struct gomp_team *team = thr->ts.team;
 
+  struct gomp_allow_completion_event *detach_event =
+    detach ? (struct gomp_allow_completion_event *) detach : NULL;
+
 #ifdef HAVE_BROKEN_POSIX_SEMAPHORES
   /* If pthread_mutex_* is used for omp_*lock*, then each task must be
      tied to one thread all the time.  This means UNTIED tasks must be
@@ -404,6 +423,10 @@ GOMP_task (void (*fn) (void *), void *data, void (*cpyfn) 
(void *, void *),
       task.final_task = (thr->task && thr->task->final_task)
                        || (flags & GOMP_TASK_FLAG_FINAL);
       task.priority = priority;
+
+      if (detach)
+       task.detach_event = detach_event;
+
       if (thr->task)
        {
          task.in_tied_task = thr->task->in_tied_task;
@@ -420,6 +443,10 @@ GOMP_task (void (*fn) (void *), void *data, void (*cpyfn) 
(void *, void *),
        }
       else
        fn (data);
+
+      if (detach)
+         gomp_sem_wait (&task.detach_event->completion_sem);
+
       /* Access to "children" is normally done inside a task_lock
         mutex region, but the only way this particular task.children
         can be set is if this thread's task work function (fn)
@@ -435,6 +462,7 @@ GOMP_task (void (*fn) (void *), void *data, void (*cpyfn) 
(void *, void *),
          gomp_clear_parent (&task.children_queue);
          gomp_mutex_unlock (&team->task_lock);
        }
+
       gomp_end_task ();
     }
   else
@@ -458,6 +486,8 @@ GOMP_task (void (*fn) (void *), void *data, void (*cpyfn) 
(void *, void *),
       task->kind = GOMP_TASK_UNDEFERRED;
       task->in_tied_task = parent->in_tied_task;
       task->taskgroup = taskgroup;
+      if (detach)
+       task->detach_event = detach_event;
       thr->task = task;
       if (cpyfn)
        {
@@ -1310,6 +1340,8 @@ gomp_barrier_handle_tasks (gomp_barrier_state_t state)
   int do_wake = 0;
 
   gomp_mutex_lock (&team->task_lock);
+  gomp_debug (0, "thread: %d, task_count %d\n",
+             thr->ts.team_id, team->task_count);
   if (gomp_barrier_last_thread (state))
     {
       if (team->task_count == 0)
@@ -1388,34 +1420,87 @@ gomp_barrier_handle_tasks (gomp_barrier_state_t state)
          thr->task = task;
        }
       else
-       return;
-      gomp_mutex_lock (&team->task_lock);
-      if (child_task)
        {
-        finish_cancelled:;
-         size_t new_tasks
-           = gomp_task_run_post_handle_depend (child_task, team);
-         gomp_task_run_post_remove_parent (child_task);
-         gomp_clear_parent (&child_task->children_queue);
-         gomp_task_run_post_remove_taskgroup (child_task);
-         to_free = child_task;
-         child_task = NULL;
-         if (!cancelled)
-           team->task_running_count--;
-         if (new_tasks > 1)
+         bool ignored;
+
+         gomp_mutex_lock (&team->task_lock);
+
+         if (priority_queue_empty_p (&team->task_detach_queue,
+                                     MEMMODEL_RELAXED))
            {
-             do_wake = team->nthreads - team->task_running_count;
-             if (do_wake > new_tasks)
-               do_wake = new_tasks;
+             gomp_mutex_unlock (&team->task_lock);
+             return;
            }
-         if (--team->task_count == 0
-             && gomp_team_barrier_waiting_for_tasks (&team->barrier))
+
+         child_task
+           = priority_queue_next_task (PQ_TEAM, &team->task_detach_queue,
+                                       PQ_IGNORED, NULL,
+                                       &ignored);
+         priority_queue_remove (PQ_TEAM, &team->task_detach_queue,
+                                child_task, MEMMODEL_RELAXED);
+         --team->task_detach_count;
+         if (!__atomic_load_n (&child_task->detach_event->fulfilled,
+                               __ATOMIC_RELAXED))
            {
-             gomp_team_barrier_done (&team->barrier, state);
+             /* Wait for detached task to finish.  */
              gomp_mutex_unlock (&team->task_lock);
-             gomp_team_barrier_wake (&team->barrier, 0);
+             gomp_debug (0,
+                         "thread: %d, waiting for event to be fulfilled %p\n",
+                         thr->ts.team_id, child_task->detach_event);
+             gomp_sem_wait (&child_task->detach_event->completion_sem);
              gomp_mutex_lock (&team->task_lock);
            }
+         else
+           gomp_debug (0, "thread: %d, queued event already fulfilled %p\n",
+                       thr->ts.team_id, child_task->detach_event);
+         goto finish_cancelled;
+       }
+      gomp_mutex_lock (&team->task_lock);
+      if (child_task)
+       {
+         if (child_task->detach_event
+             && !__atomic_load_n (&child_task->detach_event->fulfilled,
+                                  __ATOMIC_RELAXED))
+           {
+             priority_queue_insert (PQ_TEAM, &team->task_detach_queue,
+                                    child_task, child_task->priority,
+                                    PRIORITY_INSERT_END,
+                                    false, false);
+             ++team->task_detach_count;
+             gomp_debug (0, "thread: %d, queueing detached %p\n",
+                         thr->ts.team_id, child_task->detach_event);
+             child_task = NULL;
+           }
+         else
+           {
+             if (child_task->detach_event)
+               gomp_debug (0, "thread: %d, event already fulfilled %p\n",
+                           thr->ts.team_id, child_task->detach_event);
+            finish_cancelled:;
+             size_t new_tasks
+               = gomp_task_run_post_handle_depend (child_task, team);
+             gomp_task_run_post_remove_parent (child_task);
+             gomp_clear_parent (&child_task->children_queue);
+             gomp_task_run_post_remove_taskgroup (child_task);
+             to_free = child_task;
+             child_task = NULL;
+             if (!cancelled)
+               team->task_running_count--;
+             if (new_tasks > 1)
+               {
+                 do_wake = team->nthreads - team->task_running_count;
+                 if (do_wake > new_tasks)
+                   do_wake = new_tasks;
+               }
+             if (--team->task_count == 0
+                 && gomp_team_barrier_waiting_for_tasks (&team->barrier))
+               {
+                 gomp_team_barrier_done (&team->barrier, state);
+                 gomp_mutex_unlock (&team->task_lock);
+                 gomp_team_barrier_wake (&team->barrier, 0);
+                 gomp_mutex_lock (&team->task_lock);
+               }
+           }
        }
     }
 }
@@ -2326,3 +2411,18 @@ omp_in_final (void)
 }
 
 ialias (omp_in_final)
+
+void omp_fulfill_event(omp_event_handle_t event)
+{
+  struct gomp_allow_completion_event *ev =
+               (struct gomp_allow_completion_event *) event;
+
+  if (__atomic_load_n (&ev->fulfilled, __ATOMIC_RELAXED))
+    gomp_fatal ("omp_fulfill_enent: Event already fulfilled!\n");
+
+  gomp_debug(0, "omp_fulfill_event: %p\n", ev);
+  __atomic_store_n (&ev->fulfilled, true, __ATOMIC_RELAXED);
+  gomp_sem_post (&ev->completion_sem);
+}
+
+ialias (omp_fulfill_event)
diff --git a/libgomp/team.c b/libgomp/team.c
index cbc3aec..ee488f2 100644
--- a/libgomp/team.c
+++ b/libgomp/team.c
@@ -206,6 +206,9 @@ gomp_new_team (unsigned nthreads)
   team->work_share_cancelled = 0;
   team->team_cancelled = 0;
 
+  priority_queue_init (&team->task_detach_queue);
+  team->task_detach_count = 0;
+
   return team;
 }
 
@@ -221,6 +224,7 @@ free_team (struct gomp_team *team)
   gomp_barrier_destroy (&team->barrier);
   gomp_mutex_destroy (&team->task_lock);
   priority_queue_free (&team->task_queue);
+  priority_queue_free (&team->task_detach_queue);
   team_free (team);
 }
 
diff --git a/libgomp/testsuite/libgomp.c-c++-common/task-detach-1.c 
b/libgomp/testsuite/libgomp.c-c++-common/task-detach-1.c
new file mode 100644
index 0000000..7f2319c
--- /dev/null
+++ b/libgomp/testsuite/libgomp.c-c++-common/task-detach-1.c
@@ -0,0 +1,39 @@
+/* { dg-do run } */
+
+#include <omp.h>
+#include <assert.h>
+
+omp_event_handle_t detach_event1, detach_event2;
+
+int main (void)
+{
+  int x = 0, y = 0, z = 0;
+
+  #pragma omp parallel
+  {
+    #pragma omp single
+    {
+      #pragma omp task detach(detach_event1)
+      {
+       x++;
+      }
+
+      #pragma omp task detach(detach_event2)
+      {
+       y++;
+       omp_fulfill_event (detach_event1);
+      }
+
+      #pragma omp task
+      {
+       z++;
+       omp_fulfill_event (detach_event2);
+      }
+    }
+    #pragma omp taskwait
+  }
+
+  assert (x == 1);
+  assert (y == 1);
+  assert (z == 1);
+}
diff --git a/libgomp/testsuite/libgomp.c-c++-common/task-detach-2.c 
b/libgomp/testsuite/libgomp.c-c++-common/task-detach-2.c
new file mode 100644
index 0000000..330c936
--- /dev/null
+++ b/libgomp/testsuite/libgomp.c-c++-common/task-detach-2.c
@@ -0,0 +1,39 @@
+/* { dg-do run } */
+
+#include <omp.h>
+#include <assert.h>
+
+int main (void)
+{
+  int x = 0, y = 0, z = 0;
+
+  #pragma omp parallel num_threads(1)
+  {
+    #pragma omp single
+    {
+      omp_event_handle_t detach_event1, detach_event2;
+
+      #pragma omp task detach(detach_event1)
+      {
+       x++;
+      }
+
+      #pragma omp task detach(detach_event2)
+      {
+       y++;
+       omp_fulfill_event (detach_event1);
+      }
+
+      #pragma omp task
+      {
+       z++;
+       omp_fulfill_event (detach_event2);
+      }
+    }
+    #pragma omp taskwait
+  }
+
+  assert (x == 1);
+  assert (y == 1);
+  assert (z == 1);
+}
diff --git a/libgomp/testsuite/libgomp.c-c++-common/task-detach-3.c 
b/libgomp/testsuite/libgomp.c-c++-common/task-detach-3.c
new file mode 100644
index 0000000..a16f5336
--- /dev/null
+++ b/libgomp/testsuite/libgomp.c-c++-common/task-detach-3.c
@@ -0,0 +1,39 @@
+/* { dg-do run } */
+
+#include <omp.h>
+#include <assert.h>
+
+int main (void)
+{
+  int x = 0, y = 0, z = 0;
+
+  #pragma omp parallel
+  {
+    #pragma omp single
+    {
+      omp_event_handle_t detach_event;
+      int dep;
+
+      #pragma omp task depend(out:dep) detach(detach_event)
+      {
+       x++;
+      }
+
+      #pragma omp task
+      {
+       y++;
+       omp_fulfill_event(detach_event);
+      }
+
+      #pragma omp task depend(in:dep)
+      {
+       z++;
+      }
+    }
+    #pragma omp taskwait
+  }
+
+  assert (x == 1);
+  assert (y == 1);
+  assert (z == 1);
+}
diff --git a/libgomp/testsuite/libgomp.fortran/task-detach-1.f90 
b/libgomp/testsuite/libgomp.fortran/task-detach-1.f90
new file mode 100644
index 0000000..20e3675
--- /dev/null
+++ b/libgomp/testsuite/libgomp.fortran/task-detach-1.f90
@@ -0,0 +1,33 @@
+! { dg-do run }
+
+program task_detach_1
+  use omp_lib
+
+  integer (kind=omp_event_handle_kind) :: detach_event1, detach_event2
+  integer :: x = 0, y = 0, z = 0
+
+  !$omp parallel
+    !$omp single
+
+      !$omp task detach(detach_event1)
+        x = x + 1
+      !$omp end task
+
+      !$omp task detach(detach_event2)
+        y = y + 1
+       call omp_fulfill_event (detach_event1)
+      !$omp end task
+
+      !$omp task
+        z = z + 1
+       call omp_fulfill_event (detach_event2)
+      !$omp end task
+    !$omp end single
+
+    !$omp taskwait
+  !$omp end parallel
+
+  if (x /= 1) stop 1
+  if (y /= 1) stop 2
+  if (z /= 1) stop 3
+end program
diff --git a/libgomp/testsuite/libgomp.fortran/task-detach-2.f90 
b/libgomp/testsuite/libgomp.fortran/task-detach-2.f90
new file mode 100644
index 0000000..bd0f016
--- /dev/null
+++ b/libgomp/testsuite/libgomp.fortran/task-detach-2.f90
@@ -0,0 +1,33 @@
+! { dg-do run }
+
+program task_detach_2
+  use omp_lib
+
+  integer (kind=omp_event_handle_kind) :: detach_event1, detach_event2
+  integer :: x = 0, y = 0, z = 0
+
+  !$omp parallel num_threads(1)
+    !$omp single
+
+      !$omp task detach(detach_event1)
+        x = x + 1
+      !$omp end task
+
+      !$omp task detach(detach_event2)
+        y = y + 1
+       call omp_fulfill_event (detach_event1)
+      !$omp end task
+
+      !$omp task
+        z = z + 1
+       call omp_fulfill_event (detach_event2)
+      !$omp end task
+    !$omp end single
+
+    !$omp taskwait
+  !$omp end parallel
+
+  if (x /= 1) stop 1
+  if (y /= 1) stop 2
+  if (z /= 1) stop 3
+end program
diff --git a/libgomp/testsuite/libgomp.fortran/task-detach-3.f90 
b/libgomp/testsuite/libgomp.fortran/task-detach-3.f90
new file mode 100644
index 0000000..8a2ae48
--- /dev/null
+++ b/libgomp/testsuite/libgomp.fortran/task-detach-3.f90
@@ -0,0 +1,33 @@
+! { dg-do run }
+
+program task_detach_3
+
+  use omp_lib
+
+  integer (kind=omp_event_handle_kind) :: detach_event
+  integer :: x = 0, y = 0, z = 0
+  integer :: dep
+
+  !$omp parallel
+    !$omp single
+      !$omp task depend(out:dep) detach(detach_event)
+        x = x + 1
+      !$omp end task
+
+      !$omp task
+        y = y + 1
+       call omp_fulfill_event(detach_event)
+      !$omp end task
+
+      !$omp task depend(in:dep)
+        z = z + 1
+      !$omp end task
+    !$omp end single
+
+    !$omp taskwait
+  !$omp end parallel
+
+  if (x /= 1) stop 1
+  if (y /= 1) stop 2
+  if (z /= 1) stop 3
+end program

Reply via email to