On 02/22/2016 07:34 AM, Cesar Philippidis wrote: > Ping. This patch still needs a review.
Ping. I've attached a rebased version of this patch. The omp-low.c bits haven't changed, but the test cases have since Thomas has been merging some of them from trunk. I separated the omp-low.c changes because the test cases are relatively large. This patch fixes PR70533 and PR70535. Basically it teaches lower_oacc_reductions how to cope with reference-type variables (PR70533) and not to remap the reduction variables on parallel constructs (PR70535). Is it OK for trunk? Cesar > On 02/09/2016 08:17 AM, Cesar Philippidis wrote: >> On 02/09/2016 07:33 AM, Nathan Sidwell wrote: >>> While I've not looked at the rest of the patch, this bit stood out: >>> >>>> +static bool >>>> +is_oacc_parallel_reduction (tree var, omp_context *ctx) >>>> +{ >>>> + if (!is_oacc_parallel (ctx)) >>>> + return false; >>>> + >>>> + tree clauses = gimple_omp_target_clauses (ctx->stmt); >>>> + >>>> + /* Don't install a local copy of the decl if it used >>>> + inside a acc parallel reduction. */ >>> >>> ^^ comment is misleading -- this routine's not installing anything >>> >>>> + if (is_oacc_parallel (ctx)) >>> >>> ^^ already checked above. >>> >>>> + for (tree c = clauses; c; c = OMP_CLAUSE_CHAIN (c)) >>>> + if (OMP_CLAUSE_CODE (c) == OMP_CLAUSE_REDUCTION >>>> + && OMP_CLAUSE_DECL (c) == var) >>>> + return true; >>>> + >>>> + return false; >>>> +} >>>> + >> >> Thanks for catching that. Those are artifacts from when this code used >> to be located exclusively in scan_sharing_clauses. I've updated the >> patch with those changes. >> >> Cesar >> >
2016-04-05 Cesar Philippidis <ce...@codesourcery.com> gcc/ * omp-low.c (is_oacc_parallel_reduction): New function. (scan_sharing_clauses): Use it to prevent installing local variables for those used in acc parallel reductions. (lower_rec_input_clauses): Remove dead code. (lower_oacc_reductions): Add support for reference reductions. (lower_reduction_clauses): Remove dead code. (lower_omp_target): Don't remap variables appearing in acc parallel reductions. gcc/testsuite/ * gfortran.dg/goacc/reduction-promotions.f90: Add more coverage. libgomp/ * testsuite/libgomp.oacc-c-c++-common/loop-reduction-gang-np-1.c: New test. * testsuite/libgomp.oacc-c-c++-common/loop-reduction-gv-np-1.c: New test. * testsuite/libgomp.oacc-c-c++-common/loop-reduction-gw-np-1.c: New test. * testsuite/libgomp.oacc-c-c++-common/loop-reduction-gwv-np-1.c: New test. * testsuite/libgomp.oacc-c-c++-common/loop-reduction-gwv-np-2.c: New test. * testsuite/libgomp.oacc-c-c++-common/loop-reduction-gwv-np-3.c: New test. * testsuite/libgomp.oacc-c-c++-common/loop-reduction-gwv-np-4.c: New test. * testsuite/libgomp.oacc-c-c++-common/loop-reduction-vector-p-1.c: New test. * testsuite/libgomp.oacc-c-c++-common/loop-reduction-vector-p-2.c: New test. * testsuite/libgomp.oacc-c-c++-common/loop-reduction-worker-p-1.c: New test. * testsuite/libgomp.oacc-c-c++-common/loop-reduction-wv-p-1.c: New test. * testsuite/libgomp.oacc-c-c++-common/loop-reduction-wv-p-2.c: New test. * testsuite/libgomp.oacc-c-c++-common/loop-reduction-wv-p-3.c: New test. * testsuite/libgomp.oacc-c-c++-common/par-loop-comb-reduction-1.c: New test. * testsuite/libgomp.oacc-c-c++-common/par-loop-comb-reduction-2.c: New test. * testsuite/libgomp.oacc-c-c++-common/par-loop-comb-reduction-3.c: New test. * testsuite/libgomp.oacc-c-c++-common/par-loop-comb-reduction-4.c: New test. * testsuite/libgomp.oacc-c-c++-common/par-reduction-1.c: Adjust test. * testsuite/libgomp.oacc-c-c++-common/par-reduction-2.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/parallel-dims.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/parallel-reduction.c: New test. * testsuite/libgomp.oacc-c-c++-common/reduction-1.c: Adjust test. * testsuite/libgomp.oacc-c-c++-common/reduction-2.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/reduction-3.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/reduction-4.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/reduction-5.c: Likewise. * testsuite/libgomp.oacc-c-c++-common/reduction-6.c: New test. * testsuite/libgomp.oacc-c-c++-common/reduction.h: New test. * testsuite/libgomp.oacc-fortran/parallel-reduction.f90: New test. * testsuite/libgomp.oacc-fortran/reduction-1.f90: Adjust test. * testsuite/libgomp.oacc-fortran/reduction-2.f90: Likewise. * testsuite/libgomp.oacc-fortran/reduction-3.f90: Likewise. * testsuite/libgomp.oacc-fortran/reduction-4.f90: Likewise. * testsuite/libgomp.oacc-fortran/reduction-5.f90: Likewise. * testsuite/libgomp.oacc-fortran/reduction-6.f90: Likewise. * testsuite/libgomp.oacc-fortran/reduction-7.f90: New test. diff --git a/gcc/omp-low.c b/gcc/omp-low.c index 3fd6eb3..fa2d318 100644 --- a/gcc/omp-low.c +++ b/gcc/omp-low.c @@ -309,6 +309,25 @@ is_oacc_kernels (omp_context *ctx) == GF_OMP_TARGET_KIND_OACC_KERNELS)); } +/* Return true if CTX corresponds to an oacc parallel region and if + VAR is used in a reduction. */ + +static bool +is_oacc_parallel_reduction (tree var, omp_context *ctx) +{ + if (!is_oacc_parallel (ctx)) + return false; + + tree clauses = gimple_omp_target_clauses (ctx->stmt); + + for (tree c = clauses; c; c = OMP_CLAUSE_CHAIN (c)) + if (OMP_CLAUSE_CODE (c) == OMP_CLAUSE_REDUCTION + && OMP_CLAUSE_DECL (c) == var) + return true; + + return false; +} + /* If DECL is the artificial dummy VAR_DECL created for non-static data member privatization, return the underlying "this" parameter, otherwise return NULL. */ @@ -2122,7 +2141,8 @@ scan_sharing_clauses (tree clauses, omp_context *ctx, else install_var_field (decl, true, 3, ctx, base_pointers_restrict); - if (is_gimple_omp_offloaded (ctx->stmt)) + if (is_gimple_omp_offloaded (ctx->stmt) + && !is_oacc_parallel_reduction (decl, ctx)) install_var_local (decl, ctx); } } @@ -4837,7 +4857,7 @@ lower_rec_input_clauses (tree clauses, gimple_seq *ilist, gimple_seq *dlist, gimplify_assign (ptr, x, ilist); } } - else if (is_reference (var) && !is_oacc_parallel (ctx)) + else if (is_reference (var)) { /* For references that are being privatized for Fortran, allocate new backing storage for the new pointer @@ -5573,7 +5593,8 @@ lower_oacc_reductions (location_t loc, tree clauses, tree level, bool inner, tree orig = OMP_CLAUSE_DECL (c); tree var = maybe_lookup_decl (orig, ctx); tree ref_to_res = NULL_TREE; - tree incoming, outgoing; + tree incoming, outgoing, v1, v2, v3; + bool is_private = false; enum tree_code rcode = OMP_CLAUSE_REDUCTION_CODE (c); if (rcode == MINUS_EXPR) @@ -5586,7 +5607,6 @@ lower_oacc_reductions (location_t loc, tree clauses, tree level, bool inner, if (!var) var = orig; - gcc_assert (!is_reference (var)); incoming = outgoing = var; @@ -5622,22 +5642,38 @@ lower_oacc_reductions (location_t loc, tree clauses, tree level, bool inner, for (; cls; cls = OMP_CLAUSE_CHAIN (cls)) if (OMP_CLAUSE_CODE (cls) == OMP_CLAUSE_REDUCTION && orig == OMP_CLAUSE_DECL (cls)) - goto has_outer_reduction; + { + incoming = outgoing = lookup_decl (orig, probe); + goto has_outer_reduction; + } + else if ((OMP_CLAUSE_CODE (cls) == OMP_CLAUSE_FIRSTPRIVATE + || OMP_CLAUSE_CODE (cls) == OMP_CLAUSE_PRIVATE) + && orig == OMP_CLAUSE_DECL (cls)) + { + is_private = true; + goto do_lookup; + } } do_lookup: /* This is the outermost construct with this reduction, see if there's a mapping for it. */ if (gimple_code (outer->stmt) == GIMPLE_OMP_TARGET - && maybe_lookup_field (orig, outer)) + && maybe_lookup_field (orig, outer) && !is_private) { ref_to_res = build_receiver_ref (orig, false, outer); if (is_reference (orig)) ref_to_res = build_simple_mem_ref (ref_to_res); + tree type = TREE_TYPE (var); + if (POINTER_TYPE_P (type)) + type = TREE_TYPE (type); + outgoing = var; - incoming = omp_reduction_init_op (loc, rcode, TREE_TYPE (var)); + incoming = omp_reduction_init_op (loc, rcode, type); } + else if (ctx->outer) + incoming = outgoing = lookup_decl (orig, ctx->outer); else incoming = outgoing = orig; @@ -5647,6 +5683,37 @@ lower_oacc_reductions (location_t loc, tree clauses, tree level, bool inner, if (!ref_to_res) ref_to_res = integer_zero_node; + if (is_reference (orig)) + { + tree type = TREE_TYPE (var); + const char *id = IDENTIFIER_POINTER (DECL_NAME (var)); + + if (!inner) + { + tree x = create_tmp_var (TREE_TYPE (type), id); + gimplify_assign (var, build_fold_addr_expr (x), fork_seq); + } + + v1 = create_tmp_var (type, id); + v2 = create_tmp_var (type, id); + v3 = create_tmp_var (type, id); + + gimplify_assign (v1, var, fork_seq); + gimplify_assign (v2, var, fork_seq); + gimplify_assign (v3, var, fork_seq); + + var = build_simple_mem_ref (var); + v1 = build_simple_mem_ref (v1); + v2 = build_simple_mem_ref (v2); + v3 = build_simple_mem_ref (v3); + outgoing = build_simple_mem_ref (outgoing); + + if (TREE_CODE (incoming) != INTEGER_CST) + incoming = build_simple_mem_ref (incoming); + } + else + v1 = v2 = v3 = var; + /* Determine position in reduction buffer, which may be used by target. */ enum machine_mode mode = TYPE_MODE (TREE_TYPE (var)); @@ -5676,20 +5743,20 @@ lower_oacc_reductions (location_t loc, tree clauses, tree level, bool inner, = build_call_expr_internal_loc (loc, IFN_GOACC_REDUCTION, TREE_TYPE (var), 6, init_code, unshare_expr (ref_to_res), - var, level, op, off); + v1, level, op, off); tree fini_call = build_call_expr_internal_loc (loc, IFN_GOACC_REDUCTION, TREE_TYPE (var), 6, fini_code, unshare_expr (ref_to_res), - var, level, op, off); + v2, level, op, off); tree teardown_call = build_call_expr_internal_loc (loc, IFN_GOACC_REDUCTION, TREE_TYPE (var), 6, teardown_code, - ref_to_res, var, level, op, off); + ref_to_res, v3, level, op, off); - gimplify_assign (var, setup_call, &before_fork); - gimplify_assign (var, init_call, &after_fork); - gimplify_assign (var, fini_call, &before_join); + gimplify_assign (v1, setup_call, &before_fork); + gimplify_assign (v2, init_call, &after_fork); + gimplify_assign (v3, fini_call, &before_join); gimplify_assign (outgoing, teardown_call, &after_join); } @@ -5931,9 +5998,6 @@ lower_reduction_clauses (tree clauses, gimple_seq *stmt_seqp, omp_context *ctx) } } - if (is_gimple_omp_oacc (ctx->stmt)) - return; - stmt = gimple_build_call (builtin_decl_explicit (BUILT_IN_GOMP_ATOMIC_START), 0); gimple_seq_add_stmt (stmt_seqp, stmt); @@ -15820,7 +15884,9 @@ lower_omp_target (gimple_stmt_iterator *gsi_p, omp_context *ctx) if (!maybe_lookup_field (var, ctx)) continue; - if (offloaded) + /* Don't remap oacc parallel reduction variables, because the + intermediate result must be local to each gang. */ + if (offloaded && !is_oacc_parallel_reduction (var, ctx)) { x = build_receiver_ref (var, true, ctx); tree new_var = lookup_decl (var, ctx);
pr70533-tests.diff.gz
Description: application/gzip