Hi,

yesterday I forgot to post here a patch I committed to the HSA branch
which hopefully addresses all of the issues raised in the review:
- kernel_phony flags were turned into bits in gimple subcode, with the
  happy consequence that GIMPLE_OMP_TEAMS no longer needs its own
  storage layout.
- GIMPLE_OMP_GPUKERNEL got renamed to GIMPLE_OMP_GRID_BODY,
  GF_OMP_FOR_KIND_KERNEL_BODY to GF_OMP_FOR_KIND_GRID_LOOP, and the
  phony_kernel flag was turned into enum items like
  GF_OMP_PARALLEL_GRID_PHONY.
- Five new GTY roots were combined into one.
- NUM_TEAMS and THREAD_LIMIT are passed in one "device specific"
  argument if they are small constants and in two otherwise.
- All gridification-specific functions were prefixed with grid_.  I
  suppose I could move the part of gridification that happens even
  before OMP lowering to a special file but that would be something
  for another patch.
- "griddim" was changed to "_griddim_" in three places.
- I fixed formatting in all the suggested ways.

Thanks,

Martin


2015-12-16  Martin Jambor  <mjam...@suse.cz>

gcc/
	* builtin-types.def: Removed a blank line.
	* gimple-low.c (lower_stmt): Changed GIMPLE_OMP_GPUKERNEL to
	GIMPLE_OMP_GRID_BODY.
	* gimple-pretty-print.c (dump_gimple_omp_for): Changed
	GF_OMP_FOR_KIND_KERNEL_BODY to GF_OMP_FOR_KIND_GRID_LOOP.
	(dump_gimple_omp_block): Changed GIMPLE_OMP_GPUKERNEL to
	GIMPLE_OMP_GRID_BODY.
	(pp_gimple_stmt_1): Likewise.
	* gimple-walk.c (walk_gimple_stmt): Likewise.
	* gimple.c (gimple_build_omp_gpukernel): Renamed to
	gimple_build_omp_grid_body.  Changed GIMPLE_OMP_GPUKERNEL to
	GIMPLE_OMP_GRID_BODY.
	(gimple_copy): Changed GIMPLE_OMP_GPUKERNEL to GIMPLE_OMP_GRID_BODY.
	* gimple.def (GIMPLE_OMP_TEAMS): Changed back to
	GSS_OMP_SINGLE_LAYOUT.
	(GIMPLE_OMP_GPUKERNEL): Renamed to GIMPLE_OMP_GRID_BODY.
	* gimple.h (gf_mask): Changed GF_OMP_FOR_KIND_KERNEL_BODY to
	GF_OMP_FOR_KIND_GRID_LOOP.  New elements GF_OMP_PARALLEL_GRID_PHONY,
	GF_OMP_FOR_GRID_PHONY and GF_OMP_TEAMS_GRID_PHONY.
	(gomp_for): Removed field kernel_phony.
	(gimple_statement_omp_parallel_layout): Likewise.
	(gomp_teams): Changed back to GSS_OMP_SINGLE_LAYOUT.  Removed field
	kernel_phony.
	(gimple_has_substatements): Changed GIMPLE_OMP_GPUKERNEL to
	GIMPLE_OMP_GRID_BODY.
	(gimple_omp_for_kernel_phony): Renamed to gimple_omp_for_grid_phony,
	work on gimple subcode.
	(gimple_omp_for_set_kernel_phony): Renamed to
	gimple_omp_for_set_grid_phony, work on gimple subcode.
	(gimple_omp_parallel_kernel_phony): Renamed to
	gimple_omp_parallel_grid_phony, work on gimple subcode.
	(gimple_omp_parallel_set_kernel_phony): Renamed to
	gimple_omp_parallel_set_grid_phony, work on gimple subcode.
	(gimple_omp_teams_kernel_phony): Renamed to
	gimple_omp_teams_grid_phony, work on gimple subcode.
	(gimple_omp_teams_set_kernel_phony): Renamed to
	gimple_omp_teams_set_grid_phony, work on gimple subcode.
	(CASE_GIMPLE_OMP): Changed GIMPLE_OMP_GPUKERNEL to
	GIMPLE_OMP_GRID_BODY.
	* omp-low.c (build_outer_var_ref): Changed GIMPLE_OMP_GPUKERNEL to
	GIMPLE_OMP_GRID_BODY.
	(scan_sharing_clauses): Changed OMP_CLAUSE_GRIDDIM_SIZE to
	OMP_CLAUSE__GRIDDIM__SIZE and OMP_CLAUSE_GRIDDIM_GROUP to
	OMP_CLAUSE__GRIDDIM__GROUP.
	(check_omp_nesting_restrictions): Changed GIMPLE_OMP_GPUKERNEL to
	GIMPLE_OMP_GRID_BODY.
	(scan_omp_1_stmt): Likewise.
	(region_needs_kernel_p): Renamed to parallel_needs_hsa_kernel_p.
	Use gimple_code instead of is_a.
	(kernel_dim_array_type): Removed.
	(kernel_lattrs_dimnum_decl): Likewise.
	(kernel_lattrs_grid_decl): Likewise.
	(kernel_lattrs_group_decl): Likewise.
	(kernel_launch_attributes_type): Likewise.
	(grid_launch_attributes_trees): New type.
	(grid_attr_trees): New variable.
	(create_kernel_launch_attr_types): Renamed to
	grid_create_kernel_launch_attr_types.  Work on trees encapsulated in
	grid_attr_trees.
	(insert_store_range_dim): Renamed to grid_insert_store_range_dim.
	Work on trees encapsulated in grid_attr_trees.
	(get_kernel_launch_attributes): Renamed to
	grid_get_kernel_launch_attributes.  Work on trees encapsulated in
	grid_attr_trees.
	(push_target_argument_according_to_value): New function.
	(get_target_arguments): Use it to encode num_teams and thread_limit
	depending on it being constant and its value.
	(expand_omp_for_kernel): Renamed to grid_expand_omp_for_loop.
	Changed GF_OMP_FOR_KIND_KERNEL_BODY to GF_OMP_FOR_KIND_GRID_LOOP.
	(arg_decl_map): Renamed to grid_arg_decl_map.
	(remap_kernel_arg_accesses): Renamed to
	grid_remap_kernel_arg_accesses.
	(expand_target_kernel_body): Renamed to grid_expand_target_grid_body.
	Changed GIMPLE_OMP_GPUKERNEL to GIMPLE_OMP_GRID_BODY.
	(lower_omp_taskreg): Use gimple_code instead of is_a.
	(lower_omp_1): Changed GIMPLE_OMP_GPUKERNEL to GIMPLE_OMP_GRID_BODY.
	(reg_assignment_to_local_var_p): Renamed to
	grid_reg_assignment_to_local_var_p.
	(seq_only_contains_local_assignments): Renamed to
	grid_seq_only_contains_local_assignments.
	(find_single_omp_among_assignments_1): Renamed to
	grid_find_single_omp_among_assignments_1.
	(find_single_omp_among_assignments): Renamed to
	grid_find_single_omp_among_assignments.
	(find_ungridifiable_statement): Renamed to
	grid_find_ungridifiable_statement.
	(target_follows_gridifiable_pattern): Renamed to
	grid_target_follows_gridifiable_pattern.
	(process_kernel_body_copy): Renamed to grid_process_kernel_body_copy.
	(attempt_target_gridification): Renamed to
	grid_attempt_target_gridification.  Changed OMP_CLAUSE_GRIDDIM_SIZE
	to OMP_CLAUSE__GRIDDIM__SIZE and OMP_CLAUSE_GRIDDIM_GROUP to
	OMP_CLAUSE__GRIDDIM__GROUP.
	(create_target_gpukernel_stmt): Renamed to
	grid_gridify_all_targets_stmt.
	(create_target_gpukernels): Renamed to grid_gridify_all_targets.
	(make_gimple_omp_edges): Changed GIMPLE_OMP_GPUKERNEL to
	GIMPLE_OMP_GRID_BODY.
	* tree-pretty-print.c (dump_omp_clause): Changed
	OMP_CLAUSE_GRIDDIM_SIZE to OMP_CLAUSE__GRIDDIM__SIZE and
	OMP_CLAUSE_GRIDDIM_GROUP to OMP_CLAUSE__GRIDDIM__GROUP.
	* tree.c (omp_clause_code_name): Changed griddim to _griddim_.

fortran/
	* types.def: Removed a blank line.
---
 gcc/builtin-types.def     |   1 -
 gcc/fortran/types.def     |   1 -
 gcc/gimple-low.c          |   2 +-
 gcc/gimple-pretty-print.c |  10 +-
 gcc/gimple-walk.c         |   2 +-
 gcc/gimple.c              |   8 +-
 gcc/gimple.def            |   6 +-
 gcc/gimple.h              |  70 +++++------
 gcc/omp-low.c             | 312 ++++++++++++++++++++++++++--------------------
 gcc/tree-pretty-print.c   |   4 +-
 gcc/tree.c                |   2 +-
 gcc/tree.h                |   4 +-
 12 files changed, 224 insertions(+), 198 deletions(-)

diff --git a/gcc/builtin-types.def b/gcc/builtin-types.def
index 8dcf3a6..367a19a 100644
--- a/gcc/builtin-types.def
+++ b/gcc/builtin-types.def
@@ -557,7 +557,6 @@ DEF_FUNCTION_TYPE_9 (BT_FN_VOID_OMPFN_PTR_OMPCPYFN_LONG_LONG_BOOL_UINT_PTR_INT,
 		     BT_VOID, BT_PTR_FN_VOID_PTR, BT_PTR,
 		     BT_PTR_FN_VOID_PTR_PTR, BT_LONG, BT_LONG, BT_BOOL,
 		     BT_UINT, BT_PTR, BT_INT)
-
 DEF_FUNCTION_TYPE_9 (BT_FN_VOID_INT_OMPFN_SIZE_PTR_PTR_PTR_UINT_PTR_PTR,
 		     BT_VOID, BT_INT, BT_PTR_FN_VOID_PTR, BT_SIZE, BT_PTR,
 		     BT_PTR, BT_PTR, BT_UINT, BT_PTR, BT_PTR)
diff --git a/gcc/fortran/types.def b/gcc/fortran/types.def
index 283eaf4..0f55885 100644
--- a/gcc/fortran/types.def
+++ b/gcc/fortran/types.def
@@ -222,7 +222,6 @@ DEF_FUNCTION_TYPE_9 (BT_FN_VOID_OMPFN_PTR_OMPCPYFN_LONG_LONG_BOOL_UINT_PTR_INT,
 		     BT_VOID, BT_PTR_FN_VOID_PTR, BT_PTR,
 		     BT_PTR_FN_VOID_PTR_PTR, BT_LONG, BT_LONG, BT_BOOL,
 		     BT_UINT, BT_PTR, BT_INT)
-
 DEF_FUNCTION_TYPE_9 (BT_FN_VOID_INT_OMPFN_SIZE_PTR_PTR_PTR_UINT_PTR_PTR,
 		     BT_VOID, BT_INT, BT_PTR_FN_VOID_PTR, BT_SIZE, BT_PTR,
 		     BT_PTR, BT_PTR, BT_UINT, BT_PTR, BT_PTR)
diff --git a/gcc/gimple-low.c b/gcc/gimple-low.c
index d2a6a80..5f361c5 100644
--- a/gcc/gimple-low.c
+++ b/gcc/gimple-low.c
@@ -358,7 +358,7 @@ lower_stmt (gimple_stmt_iterator *gsi, struct lower_data *data)
     case GIMPLE_OMP_TASK:
     case GIMPLE_OMP_TARGET:
     case GIMPLE_OMP_TEAMS:
-    case GIMPLE_OMP_GPUKERNEL:
+    case GIMPLE_OMP_GRID_BODY:
       data->cannot_fallthru = false;
       lower_omp_directive (gsi, data);
       data->cannot_fallthru = false;
diff --git a/gcc/gimple-pretty-print.c b/gcc/gimple-pretty-print.c
index 7a0c540..24c25b5 100644
--- a/gcc/gimple-pretty-print.c
+++ b/gcc/gimple-pretty-print.c
@@ -1187,8 +1187,8 @@ dump_gimple_omp_for (pretty_printer *buffer, gomp_for *gs, int spc, int flags)
 	case GF_OMP_FOR_KIND_CILKSIMD:
 	  pp_string (buffer, "#pragma simd");
 	  break;
-	case GF_OMP_FOR_KIND_KERNEL_BODY:
-	  pp_string (buffer, "#pragma omp for kernel");
+	case GF_OMP_FOR_KIND_GRID_LOOP:
+	  pp_string (buffer, "#pragma omp for grid_loop");
 	  break;
 	default:
 	  gcc_unreachable ();
@@ -1497,8 +1497,8 @@ dump_gimple_omp_block (pretty_printer *buffer, gimple *gs, int spc, int flags)
 	case GIMPLE_OMP_SECTION:
 	  pp_string (buffer, "#pragma omp section");
 	  break;
-	case GIMPLE_OMP_GPUKERNEL:
-	  pp_string (buffer, "#pragma omp gpukernel");
+	case GIMPLE_OMP_GRID_BODY:
+	  pp_string (buffer, "#pragma omp gridified body");
 	  break;
 	default:
 	  gcc_unreachable ();
@@ -2282,7 +2282,7 @@ pp_gimple_stmt_1 (pretty_printer *buffer, gimple *gs, int spc, int flags)
     case GIMPLE_OMP_MASTER:
     case GIMPLE_OMP_TASKGROUP:
     case GIMPLE_OMP_SECTION:
-    case GIMPLE_OMP_GPUKERNEL:
+    case GIMPLE_OMP_GRID_BODY:
       dump_gimple_omp_block (buffer, gs, spc, flags);
       break;
diff --git a/gcc/gimple-walk.c b/gcc/gimple-walk.c
index 695592d..9bd049e 100644
--- a/gcc/gimple-walk.c
+++ b/gcc/gimple-walk.c
@@ -644,7 +644,7 @@ walk_gimple_stmt (gimple_stmt_iterator *gsi, walk_stmt_fn callback_stmt,
     case GIMPLE_OMP_SINGLE:
     case GIMPLE_OMP_TARGET:
     case GIMPLE_OMP_TEAMS:
-    case GIMPLE_OMP_GPUKERNEL:
+    case GIMPLE_OMP_GRID_BODY:
       ret = walk_gimple_seq_mod (gimple_omp_body_ptr (stmt), callback_stmt,
				 callback_op, wi);
       if (ret)
diff --git a/gcc/gimple.c b/gcc/gimple.c
index 4a1a75a..c0284b0 100644
--- a/gcc/gimple.c
+++ b/gcc/gimple.c
@@ -954,14 +954,14 @@ gimple_build_omp_master (gimple_seq body)
   return p;
 }
 
-/* Build a GIMPLE_OMP_GPUKERNEL statement.
+/* Build a GIMPLE_OMP_GRID_BODY statement.
 
    BODY is the sequence of statements to be executed by the kernel.  */
 
 gimple *
-gimple_build_omp_gpukernel (gimple_seq body)
+gimple_build_omp_grid_body (gimple_seq body)
 {
-  gimple *p = gimple_alloc (GIMPLE_OMP_GPUKERNEL, 0);
+  gimple *p = gimple_alloc (GIMPLE_OMP_GRID_BODY, 0);
   if (body)
     gimple_omp_set_body (p, body);
 
@@ -1818,7 +1818,7 @@ gimple_copy (gimple *stmt)
     case GIMPLE_OMP_SECTION:
     case GIMPLE_OMP_MASTER:
     case GIMPLE_OMP_TASKGROUP:
-    case GIMPLE_OMP_GPUKERNEL:
+    case GIMPLE_OMP_GRID_BODY:
     copy_omp_body:
       new_seq = gimple_seq_copy (gimple_omp_body (stmt));
       gimple_omp_set_body (copy, new_seq);
diff --git a/gcc/gimple.def b/gcc/gimple.def
index 30f0111..94287a2 100644
--- a/gcc/gimple.def
+++ b/gcc/gimple.def
@@ -369,16 +369,16 @@ DEFGSCODE(GIMPLE_OMP_TARGET, "gimple_omp_target", GSS_OMP_PARALLEL_LAYOUT)
 /* GIMPLE_OMP_TEAMS <BODY, CLAUSES> represents #pragma omp teams
    BODY is the sequence of statements inside the single section.
    CLAUSES is an OMP_CLAUSE chain holding the associated clauses.  */
-DEFGSCODE(GIMPLE_OMP_TEAMS, "gimple_omp_teams", GSS_OMP_TEAMS_LAYOUT)
+DEFGSCODE(GIMPLE_OMP_TEAMS, "gimple_omp_teams", GSS_OMP_SINGLE_LAYOUT)
 
 /* GIMPLE_OMP_ORDERED <BODY, CLAUSES> represents #pragma omp ordered.
    BODY is the sequence of statements to execute in the ordered section.
    CLAUSES is an OMP_CLAUSE chain holding the associated clauses.  */
 DEFGSCODE(GIMPLE_OMP_ORDERED, "gimple_omp_ordered", GSS_OMP_SINGLE_LAYOUT)
 
-/* GIMPLE_OMP_GPUKERNEL <BODY> represents a parallel loop lowered for execution
+/* GIMPLE_OMP_GRID_BODY <BODY> represents a parallel loop lowered for execution
    on a GPU.  It is an artificial statement created by omp lowering.  */
-DEFGSCODE(GIMPLE_OMP_GPUKERNEL, "gimple_omp_gpukernel", GSS_OMP)
+DEFGSCODE(GIMPLE_OMP_GRID_BODY, "gimple_omp_gpukernel", GSS_OMP)
 
 /* GIMPLE_PREDICT <PREDICT, OUTCOME> specifies a hint for branch prediction.
diff --git a/gcc/gimple.h b/gcc/gimple.h
index 2f203c1..ae9da2d 100644
--- a/gcc/gimple.h
+++ b/gcc/gimple.h
@@ -146,6 +146,7 @@ enum gf_mask {
     GF_CALL_CTRL_ALTERING       = 1 << 7,
     GF_CALL_WITH_BOUNDS 	= 1 << 8,
     GF_OMP_PARALLEL_COMBINED	= 1 << 0,
+    GF_OMP_PARALLEL_GRID_PHONY	= 1 << 1,
     GF_OMP_TASK_TASKLOOP	= 1 << 0,
     GF_OMP_FOR_KIND_MASK	= (1 << 4) - 1,
     GF_OMP_FOR_KIND_FOR		= 0,
@@ -153,13 +154,14 @@ enum gf_mask {
     GF_OMP_FOR_KIND_TASKLOOP	= 2,
     GF_OMP_FOR_KIND_CILKFOR     = 3,
     GF_OMP_FOR_KIND_OACC_LOOP	= 4,
-    GF_OMP_FOR_KIND_KERNEL_BODY = 5,
+    GF_OMP_FOR_KIND_GRID_LOOP	= 5,
     /* Flag for SIMD variants of OMP_FOR kinds.  */
     GF_OMP_FOR_SIMD		= 1 << 3,
     GF_OMP_FOR_KIND_SIMD	= GF_OMP_FOR_SIMD | 0,
     GF_OMP_FOR_KIND_CILKSIMD	= GF_OMP_FOR_SIMD | 1,
     GF_OMP_FOR_COMBINED		= 1 << 4,
     GF_OMP_FOR_COMBINED_INTO	= 1 << 5,
+    GF_OMP_FOR_GRID_PHONY	= 1 << 6,
     GF_OMP_TARGET_KIND_MASK	= (1 << 4) - 1,
     GF_OMP_TARGET_KIND_REGION	= 0,
     GF_OMP_TARGET_KIND_DATA	= 1,
@@ -173,6 +175,7 @@ enum gf_mask {
     GF_OMP_TARGET_KIND_OACC_ENTER_EXIT_DATA = 9,
     GF_OMP_TARGET_KIND_OACC_DECLARE = 10,
     GF_OMP_TARGET_KIND_OACC_HOST_DATA = 11,
+    GF_OMP_TEAMS_GRID_PHONY	= 1 << 0,
 
     /* True on an GIMPLE_OMP_RETURN statement if the return does not require
        a thread synchronization via some sort of barrier.  The exact barrier
@@ -624,12 +627,6 @@ struct GTY((tag("GSS_OMP_FOR")))
   /* [ WORD 11 ]
      Pre-body evaluated before the loop body begins.  */
   gimple_seq pre_body;
-
-  /* [ WORD 12 ]
-     If set, this statement is part of a gridified kernel, its clauses need to
-     be scanned and lowered but the statement should be discarded after
-     lowering.  */
-  bool kernel_phony;
 };
 
@@ -651,12 +648,6 @@ struct GTY((tag("GSS_OMP_PARALLEL_LAYOUT")))
   /* [ WORD 10 ]
      Shared data argument.  */
   tree data_arg;
-
-  /* [ WORD 11 ] */
-  /* If set, this statement is part of a gridified kernel, its clauses need to
-     be scanned and lowered but the statement should be discarded after
-     lowering.  */
-  bool kernel_phony;
 };
 
 /* GIMPLE_OMP_PARALLEL or GIMPLE_TASK */
@@ -757,18 +748,11 @@ struct GTY((tag("GSS_OMP_SINGLE_LAYOUT")))
 	 stmt->code == GIMPLE_OMP_SINGLE.  */
 };
 
-/* GIMPLE_OMP_TEAMS */
-
-struct GTY((tag("GSS_OMP_TEAMS_LAYOUT")))
+struct GTY((tag("GSS_OMP_SINGLE_LAYOUT")))
   gomp_teams : public gimple_statement_omp_single_layout
 {
-  /* [ WORD 1-8 ] : base class */
-
-  /* [ WORD 9 ]
-     If set, this statement is part of a gridified kernel, its clauses need to
-     be scanned and lowered but the statement should be discarded after
-     lowering.  */
-  bool kernel_phony;
+    /* No extra fields; adds invariant:
+	 stmt->code == GIMPLE_OMP_TEAMS.  */
 };
 
 struct GTY((tag("GSS_OMP_SINGLE_LAYOUT")))
@@ -1472,7 +1456,7 @@ gomp_task *gimple_build_omp_task (gimple_seq, tree, tree, tree, tree,
				  tree, tree);
 gimple *gimple_build_omp_section (gimple_seq);
 gimple *gimple_build_omp_master (gimple_seq);
-gimple *gimple_build_omp_gpukernel (gimple_seq);
+gimple *gimple_build_omp_grid_body (gimple_seq);
 gimple *gimple_build_omp_taskgroup (gimple_seq);
 gomp_continue *gimple_build_omp_continue (tree, tree);
 gomp_ordered *gimple_build_omp_ordered (gimple_seq, tree);
@@ -1733,7 +1717,7 @@ gimple_has_substatements (gimple *g)
     case GIMPLE_OMP_CRITICAL:
     case GIMPLE_WITH_CLEANUP_EXPR:
     case GIMPLE_TRANSACTION:
-    case GIMPLE_OMP_GPUKERNEL:
+    case GIMPLE_OMP_GRID_BODY:
       return true;
 
     default:
@@ -5102,17 +5086,20 @@ gimple_omp_for_set_pre_body (gimple *gs, gimple_seq pre_body)
 /* Return the kernel_phony of OMP_FOR statement.  */
 
 static inline bool
-gimple_omp_for_kernel_phony (const gomp_for *omp_for)
+gimple_omp_for_grid_phony (const gomp_for *omp_for)
 {
-  return omp_for->kernel_phony;
+  return (gimple_omp_subcode (omp_for) & GF_OMP_FOR_GRID_PHONY) != 0;
 }
 
 /* Set kernel_phony flag of OMP_FOR to VALUE.  */
 
 static inline void
-gimple_omp_for_set_kernel_phony (gomp_for *omp_for, bool value)
+gimple_omp_for_set_grid_phony (gomp_for *omp_for, bool value)
 {
-  omp_for->kernel_phony = value;
+  if (value)
+    omp_for->subcode |= GF_OMP_FOR_GRID_PHONY;
+  else
+    omp_for->subcode &= ~GF_OMP_FOR_GRID_PHONY;
 }
 
 /* Return the clauses associated with OMP_PARALLEL GS.  */
@@ -5203,18 +5190,20 @@ gimple_omp_parallel_set_data_arg (gomp_parallel *omp_parallel_stmt,
 /* Return the kernel_phony flag of OMP_PARALLEL_STMT.  */
 
 static inline bool
-gimple_omp_parallel_kernel_phony (const gomp_parallel *omp_parallel_stmt)
+gimple_omp_parallel_grid_phony (const gomp_parallel *stmt)
 {
-  return omp_parallel_stmt->kernel_phony;
+  return (gimple_omp_subcode (stmt) & GF_OMP_PARALLEL_GRID_PHONY) != 0;
 }
 
 /* Set kernel_phony flag of OMP_PARALLEL_STMT to VALUE.  */
 
 static inline void
-gimple_omp_parallel_set_kernel_phony (gomp_parallel *omp_parallel_stmt,
-				      bool value)
+gimple_omp_parallel_set_grid_phony (gomp_parallel *stmt, bool value)
 {
-  omp_parallel_stmt->kernel_phony = value;
+  if (value)
+    stmt->subcode |= GF_OMP_PARALLEL_GRID_PHONY;
+  else
+    stmt->subcode &= ~GF_OMP_PARALLEL_GRID_PHONY;
 }
 
 /* Return the clauses associated with OMP_TASK GS.  */
@@ -5692,17 +5681,20 @@ gimple_omp_teams_set_clauses (gomp_teams *omp_teams_stmt, tree clauses)
 /* Return the kernel_phony flag of an OMP_TEAMS_STMT.  */
 
 static inline bool
-gimple_omp_teams_kernel_phony (const gomp_teams *omp_teams_stmt)
+gimple_omp_teams_grid_phony (const gomp_teams *omp_teams_stmt)
 {
-  return omp_teams_stmt->kernel_phony;
+  return (gimple_omp_subcode (omp_teams_stmt) & GF_OMP_TEAMS_GRID_PHONY) != 0;
 }
 
 /* Set kernel_phony flag of an OMP_TEAMS_STMT to VALUE.  */
 
 static inline void
-gimple_omp_teams_set_kernel_phony (gomp_teams *omp_teams_stmt, bool value)
+gimple_omp_teams_set_grid_phony (gomp_teams *omp_teams_stmt, bool value)
 {
-  omp_teams_stmt->kernel_phony = value;
+  if (value)
+    omp_teams_stmt->subcode |= GF_OMP_TEAMS_GRID_PHONY;
+  else
+    omp_teams_stmt->subcode &= ~GF_OMP_TEAMS_GRID_PHONY;
 }
 
 /* Return the clauses associated with OMP_SECTIONS GS.  */
@@ -6034,7 +6026,7 @@ gimple_return_set_retbnd (gimple *gs, tree retval)
     case GIMPLE_OMP_ATOMIC_LOAD:			\
     case GIMPLE_OMP_ATOMIC_STORE:			\
     case GIMPLE_OMP_CONTINUE:				\
-    case GIMPLE_OMP_GPUKERNEL
+    case GIMPLE_OMP_GRID_BODY
 
 static inline bool
 is_gimple_omp (const gimple *stmt)
diff --git a/gcc/omp-low.c b/gcc/omp-low.c
index e4ce273..0f6f149 100644
--- a/gcc/omp-low.c
+++ b/gcc/omp-low.c
@@ -1339,11 +1339,11 @@ build_outer_var_ref (tree var, omp_context *ctx, bool lastprivate = false)
   else if (ctx->outer)
     {
       omp_context *outer = ctx->outer;
-      if (gimple_code (outer->stmt) == GIMPLE_OMP_GPUKERNEL)
+      if (gimple_code (outer->stmt) == GIMPLE_OMP_GRID_BODY)
	{
	  outer = outer->outer;
	  gcc_assert (outer
-		      && gimple_code (outer->stmt) != GIMPLE_OMP_GPUKERNEL);
+		      && gimple_code (outer->stmt) != GIMPLE_OMP_GRID_BODY);
	}
       x = lookup_decl (var, outer);
     }
@@ -2160,8 +2160,8 @@ scan_sharing_clauses (tree clauses, omp_context *ctx,
	case OMP_CLAUSE__GRIDDIM_:
	  if (ctx->outer)
	    {
-	      scan_omp_op (&OMP_CLAUSE_GRIDDIM_SIZE (c), ctx->outer);
-	      scan_omp_op (&OMP_CLAUSE_GRIDDIM_GROUP (c), ctx->outer);
+	      scan_omp_op (&OMP_CLAUSE__GRIDDIM__SIZE (c), ctx->outer);
+	      scan_omp_op (&OMP_CLAUSE__GRIDDIM__GROUP (c), ctx->outer);
	    }
	  break;
 
@@ -2683,7 +2683,7 @@ scan_omp_parallel (gimple_stmt_iterator *gsi, omp_context *outer_ctx)
   DECL_NAMELESS (name) = 1;
   TYPE_NAME (ctx->record_type) = name;
   TYPE_ARTIFICIAL (ctx->record_type) = 1;
-  if (!gimple_omp_parallel_kernel_phony (stmt))
+  if (!gimple_omp_parallel_grid_phony (stmt))
     {
       create_omp_child_function (ctx, false);
       gimple_omp_parallel_set_child_fn (stmt, ctx->cb.dst_fn);
@@ -3227,7 +3227,7 @@ check_omp_nesting_restrictions (gimple *stmt, omp_context *ctx)
 {
   tree c;
 
-  if (ctx && gimple_code (ctx->stmt) == GIMPLE_OMP_GPUKERNEL)
+  if (ctx && gimple_code (ctx->stmt) == GIMPLE_OMP_GRID_BODY)
     /* GPUKERNEL is an artificial construct, nesting rules will be checked in
        the original copy of its contents.  */
     return true;
@@ -3958,7 +3958,7 @@ scan_omp_1_stmt (gimple_stmt_iterator *gsi, bool *handled_ops_p,
     case GIMPLE_OMP_TASKGROUP:
     case GIMPLE_OMP_ORDERED:
     case GIMPLE_OMP_CRITICAL:
-    case GIMPLE_OMP_GPUKERNEL:
+    case GIMPLE_OMP_GRID_BODY:
       ctx = new_omp_context (stmt, ctx);
       scan_omp (gimple_omp_body_ptr (stmt), ctx);
       break;
@@ -6392,10 +6392,10 @@ gimple_build_cond_empty (tree cond)
 }
 
 /* Return true if a parallel REGION is within a declare target function or
-   within a target region and is not a part of a gridified kernel.  */
+   within a target region and is not a part of a gridified target.  */
 
 static bool
-region_needs_kernel_p (struct omp_region *region)
+parallel_needs_hsa_kernel_p (struct omp_region *region)
 {
   bool indirect = false;
   for (region = region->outer; region; region = region->outer)
@@ -6404,8 +6404,8 @@ region_needs_kernel_p (struct omp_region *region)
	indirect = true;
       else if (region->type == GIMPLE_OMP_TARGET)
	{
-	  gomp_target *tgt_stmt;
-	  tgt_stmt = as_a <gomp_target *> (last_stmt (region->entry));
+	  gomp_target *tgt_stmt
+	    = as_a <gomp_target *> (last_stmt (region->entry));
 
	  if (find_omp_clause (gimple_omp_target_clauses (tgt_stmt),
			       OMP_CLAUSE__GRIDDIM_))
@@ -6609,7 +6609,7 @@ expand_parallel_call (struct omp_region *region, basic_block bb,
			    false, GSI_CONTINUE_LINKING);
 
   if (hsa_gen_requested_p ()
-      && region_needs_kernel_p (region))
+      && parallel_needs_hsa_kernel_p (region))
     {
       cgraph_node *child_cnode = cgraph_node::get (child_fndecl);
       hsa_register_kernel (child_cnode);
@@ -12655,42 +12655,50 @@ mark_loops_in_oacc_kernels_region (basic_block region_entry,
 /* Types used to pass grid and wortkgroup sizes to kernel invocation.  */
 
-static GTY(()) tree kernel_dim_array_type;
-static GTY(()) tree kernel_lattrs_dimnum_decl;
-static GTY(()) tree kernel_lattrs_grid_decl;
-static GTY(()) tree kernel_lattrs_group_decl;
-static GTY(()) tree kernel_launch_attributes_type;
+struct GTY(()) grid_launch_attributes_trees
+{
+  tree kernel_dim_array_type;
+  tree kernel_lattrs_dimnum_decl;
+  tree kernel_lattrs_grid_decl;
+  tree kernel_lattrs_group_decl;
+  tree kernel_launch_attributes_type;
+};
+
+static GTY(()) struct grid_launch_attributes_trees *grid_attr_trees;
 
 /* Create types used to pass kernel launch attributes to target.  */
 
 static void
-create_kernel_launch_attr_types (void)
+grid_create_kernel_launch_attr_types (void)
 {
-  if (kernel_launch_attributes_type)
+  if (grid_attr_trees)
     return;
-
-  tree dim_arr_index_type;
-  dim_arr_index_type = build_index_type (build_int_cst (integer_type_node, 2));
-  kernel_dim_array_type = build_array_type (uint32_type_node,
-					    dim_arr_index_type);
-
-  kernel_launch_attributes_type = make_node (RECORD_TYPE);
-  kernel_lattrs_dimnum_decl = build_decl (BUILTINS_LOCATION, FIELD_DECL,
-					  get_identifier ("ndim"),
-					  uint32_type_node);
-  DECL_CHAIN (kernel_lattrs_dimnum_decl) = NULL_TREE;
-
-  kernel_lattrs_grid_decl = build_decl (BUILTINS_LOCATION, FIELD_DECL,
-					get_identifier ("grid_size"),
-					kernel_dim_array_type);
-  DECL_CHAIN (kernel_lattrs_grid_decl) = kernel_lattrs_dimnum_decl;
-  kernel_lattrs_group_decl = build_decl (BUILTINS_LOCATION, FIELD_DECL,
-					 get_identifier ("group_size"),
-					 kernel_dim_array_type);
-  DECL_CHAIN (kernel_lattrs_group_decl) = kernel_lattrs_grid_decl;
-  finish_builtin_struct (kernel_launch_attributes_type,
+  grid_attr_trees = ggc_alloc <grid_launch_attributes_trees> ();
+
+  tree dim_arr_index_type
+    = build_index_type (build_int_cst (integer_type_node, 2));
+  grid_attr_trees->kernel_dim_array_type
+    = build_array_type (uint32_type_node, dim_arr_index_type);
+
+  grid_attr_trees->kernel_launch_attributes_type = make_node (RECORD_TYPE);
+  grid_attr_trees->kernel_lattrs_dimnum_decl
+    = build_decl (BUILTINS_LOCATION, FIELD_DECL, get_identifier ("ndim"),
+		  uint32_type_node);
+  DECL_CHAIN (grid_attr_trees->kernel_lattrs_dimnum_decl) = NULL_TREE;
+
+  grid_attr_trees->kernel_lattrs_grid_decl
+    = build_decl (BUILTINS_LOCATION, FIELD_DECL, get_identifier ("grid_size"),
+		  grid_attr_trees->kernel_dim_array_type);
+  DECL_CHAIN (grid_attr_trees->kernel_lattrs_grid_decl)
+    = grid_attr_trees->kernel_lattrs_dimnum_decl;
+  grid_attr_trees->kernel_lattrs_group_decl
+    = build_decl (BUILTINS_LOCATION, FIELD_DECL, get_identifier ("group_size"),
+		  grid_attr_trees->kernel_dim_array_type);
+  DECL_CHAIN (grid_attr_trees->kernel_lattrs_group_decl)
+    = grid_attr_trees->kernel_lattrs_grid_decl;
+  finish_builtin_struct (grid_attr_trees->kernel_launch_attributes_type,
 			 "__gomp_kernel_launch_attributes",
-			 kernel_lattrs_group_decl, NULL_TREE);
+			 grid_attr_trees->kernel_lattrs_group_decl, NULL_TREE);
 }
 
 /* Insert before the current statement in GSI a store of VALUE to INDEX of
@@ -12698,11 +12706,12 @@ create_kernel_launch_attr_types (void)
    of type uint32_type_node.  */
 
 static void
-insert_store_range_dim (gimple_stmt_iterator *gsi, tree range_var,
-			tree fld_decl, int index, tree value)
+grid_insert_store_range_dim (gimple_stmt_iterator *gsi, tree range_var,
+			     tree fld_decl, int index, tree value)
 {
   tree ref = build4 (ARRAY_REF, uint32_type_node,
-		     build3 (COMPONENT_REF, kernel_dim_array_type,
+		     build3 (COMPONENT_REF,
+			     grid_attr_trees->kernel_dim_array_type,
			     range_var, fld_decl, NULL_TREE),
		     build_int_cst (integer_type_node, index),
		     NULL_TREE, NULL_TREE);
@@ -12715,11 +12724,12 @@ insert_store_range_dim (gimple_stmt_iterator *gsi, tree range_var,
    necessary information in it.  */
 
 static tree
-get_kernel_launch_attributes (gimple_stmt_iterator *gsi, gomp_target *tgt_stmt)
+grid_get_kernel_launch_attributes (gimple_stmt_iterator *gsi,
+				   gomp_target *tgt_stmt)
 {
-  create_kernel_launch_attr_types ();
+  grid_create_kernel_launch_attr_types ();
   tree u32_one = build_one_cst (uint32_type_node);
-  tree lattrs = create_tmp_var (kernel_launch_attributes_type,
+  tree lattrs = create_tmp_var (grid_attr_trees->kernel_launch_attributes_type,
				"__kernel_launch_attrs");
 
   unsigned max_dim = 0;
@@ -12733,14 +12743,16 @@ get_kernel_launch_attributes (gimple_stmt_iterator *gsi, gomp_target *tgt_stmt)
       unsigned dim = OMP_CLAUSE_GRIDDIM_DIMENSION (clause);
       max_dim = MAX (dim, max_dim);
 
-      insert_store_range_dim (gsi, lattrs, kernel_lattrs_grid_decl, dim,
-			      OMP_CLAUSE_GRIDDIM_SIZE (clause));
-      insert_store_range_dim (gsi, lattrs, kernel_lattrs_group_decl, dim,
-			      OMP_CLAUSE_GRIDDIM_GROUP (clause));
+      grid_insert_store_range_dim (gsi, lattrs,
+				   grid_attr_trees->kernel_lattrs_grid_decl,
+				   dim, OMP_CLAUSE__GRIDDIM__SIZE (clause));
+      grid_insert_store_range_dim (gsi, lattrs,
+				   grid_attr_trees->kernel_lattrs_group_decl,
+				   dim, OMP_CLAUSE__GRIDDIM__GROUP (clause));
     }
 
-  tree dimref = build3 (COMPONENT_REF, uint32_type_node,
-			lattrs, kernel_lattrs_dimnum_decl, NULL_TREE);
+  tree dimref = build3 (COMPONENT_REF, uint32_type_node, lattrs,
+			grid_attr_trees->kernel_lattrs_dimnum_decl, NULL_TREE);
   /* At this moment we cannot gridify a loop with a collapse clause.  */
   /* TODO: Adjust when we support bigger collapse.  */
   gcc_assert (max_dim == 0);
@@ -12792,30 +12804,52 @@ get_target_argument_value (gimple_stmt_iterator *gsi, int device, int id,
   return force_gimple_operand_gsi (gsi, t, true, NULL, true, GSI_SAME_STMT);
 }
 
+/* If VALUE is an integer constant greater than -2^15 and smaller than 2^15,
+   push one argument to ARGS with both the DEVICE, ID and VALUE embedded in it,
+   otherwise push an identifier (with DEVICE and ID) and the VALUE in two
+   arguments.  */
+
+static void
+push_target_argument_according_to_value (gimple_stmt_iterator *gsi, int device,
+					 int id, tree value, vec <tree> *args)
+{
+  if (tree_fits_shwi_p (value)
+      && tree_to_shwi (value) > -(1 << 15)
+      && tree_to_shwi (value) < (1 << 15))
+    args->quick_push (get_target_argument_value (gsi, device, id, value));
+  else
+    {
+      args->quick_push (get_target_argument_identifier (device, true, id));
+      value = fold_convert (ptr_type_node, value);
+      value = force_gimple_operand_gsi (gsi, value, true, NULL, true,
+					GSI_SAME_STMT);
+      args->quick_push (value);
+    }
+}
+
 /* Create an array of arguments that is then passed to GOMP_target.  */
 
 static tree
 get_target_arguments (gimple_stmt_iterator *gsi, gomp_target *tgt_stmt)
 {
-  auto_vec <tree, 4> args;
+  auto_vec <tree, 6> args;
   tree clauses = gimple_omp_target_clauses (tgt_stmt);
   tree t, c = find_omp_clause (clauses, OMP_CLAUSE_NUM_TEAMS);
   if (c)
     t = OMP_CLAUSE_NUM_TEAMS_EXPR (c);
   else
     t = integer_minus_one_node;
-  t = get_target_argument_value (gsi, GOMP_TARGET_ARG_DEVICE_ALL,
-				 GOMP_TARGET_ARG_NUM_TEAMS, t);
-  args.quick_push (t);
+  push_target_argument_according_to_value (gsi, GOMP_TARGET_ARG_DEVICE_ALL,
+					   GOMP_TARGET_ARG_NUM_TEAMS, t, &args);
 
   c = find_omp_clause (clauses, OMP_CLAUSE_THREAD_LIMIT);
   if (c)
     t = OMP_CLAUSE_THREAD_LIMIT_EXPR (c);
   else
     t = integer_minus_one_node;
-  t = get_target_argument_value (gsi, GOMP_TARGET_ARG_DEVICE_ALL,
-				 GOMP_TARGET_ARG_THREAD_LIMIT, t);
-  args.quick_push (t);
+  push_target_argument_according_to_value (gsi, GOMP_TARGET_ARG_DEVICE_ALL,
+					   GOMP_TARGET_ARG_THREAD_LIMIT, t,
+					   &args);
 
   /* Add HSA-specific grid sizes, if available.  */
   if (find_omp_clause (gimple_omp_target_clauses (tgt_stmt),
@@ -12824,7 +12858,7 @@ get_target_arguments (gimple_stmt_iterator *gsi, gomp_target *tgt_stmt)
       t = get_target_argument_identifier (GOMP_DEVICE_HSA, true,
					  GOMP_TARGET_ARG_HSA_KERNEL_ATTRIBUTES);
       args.quick_push (t);
-      args.quick_push (get_kernel_launch_attributes (gsi, tgt_stmt));
+      args.quick_push (grid_get_kernel_launch_attributes (gsi, tgt_stmt));
     }
 
   /* Produce more, perhaps device specific, arguments here.  */
@@ -13374,7 +13408,7 @@ expand_omp_target (struct omp_region *region)
    variable derived from the thread number.  */
 
 static void
-expand_omp_for_kernel (struct omp_region *kfor)
+grid_expand_omp_for_loop (struct omp_region *kfor)
 {
   tree t, threadid;
   tree type, itype;
@@ -13384,7 +13418,7 @@ grid_expand_omp_for_loop (struct omp_region *kfor)
   gomp_for *for_stmt = as_a <gomp_for *> (last_stmt (kfor->entry));
   gcc_checking_assert (gimple_omp_for_kind (for_stmt)
-		       == GF_OMP_FOR_KIND_KERNEL_BODY);
+		       == GF_OMP_FOR_KIND_GRID_LOOP);
   basic_block body_bb = FALLTHRU_EDGE (kfor->entry)->dest;
 
   gcc_assert (gimple_omp_for_collapse (for_stmt) == 1);
@@ -13447,10 +13481,10 @@ grid_expand_omp_for_loop (struct omp_region *kfor)
   set_immediate_dominator (CDI_DOMINATORS, kfor->exit, kfor->cont);
 }
 
-/* Structure passed to remap_kernel_arg_accesses so that it can remap
+/* Structure passed to grid_remap_kernel_arg_accesses so that it can remap
    argument_decls.  */
 
-struct arg_decl_map
+struct grid_arg_decl_map
 {
   tree old_arg;
   tree new_arg;
@@ -13460,10 +13494,10 @@ struct grid_arg_decl_map
    pertaining to kernel function.  */
 
 static tree
-remap_kernel_arg_accesses (tree *tp, int *walk_subtrees, void *data)
+grid_remap_kernel_arg_accesses (tree *tp, int *walk_subtrees, void *data)
 {
   struct walk_stmt_info *wi = (struct walk_stmt_info *) data;
-  struct arg_decl_map *adm = (struct arg_decl_map *) wi->info;
+  struct grid_arg_decl_map *adm = (struct grid_arg_decl_map *) wi->info;
   tree t = *tp;
 
   if (t == adm->old_arg)
@@ -13478,7 +13512,7 @@ static void expand_omp (struct omp_region *region);
    TARGET and expand it in GPGPU kernel fashion.  */
 
 static void
-expand_target_kernel_body (struct omp_region *target)
+grid_expand_target_grid_body (struct omp_region *target)
 {
   if (!hsa_gen_requested_p ())
     return;
@@ -13487,7 +13521,7 @@ grid_expand_target_grid_body (struct omp_region *target)
   struct omp_region **pp;
 
   for (pp = &target->inner; *pp; pp = &(*pp)->next)
-    if ((*pp)->type == GIMPLE_OMP_GPUKERNEL)
+    if ((*pp)->type == GIMPLE_OMP_GRID_BODY)
       break;
 
   struct omp_region *gpukernel = *pp;
@@ -13518,7 +13552,7 @@ grid_expand_target_grid_body (struct omp_region *target)
   struct omp_region *kfor = *pp;
   gcc_assert (kfor);
   gcc_assert (gimple_omp_for_kind (last_stmt ((kfor)->entry))
-	      == GF_OMP_FOR_KIND_KERNEL_BODY);
+	      == GF_OMP_FOR_KIND_GRID_LOOP);
   *pp = kfor->next;
   if (kfor->inner)
     expand_omp (kfor->inner);
@@ -13547,7 +13581,7 @@ grid_expand_target_grid_body (struct omp_region *target)
   kern_cfun->curr_properties = cfun->curr_properties;
 
   remove_edge (BRANCH_EDGE (kfor->entry));
-  expand_omp_for_kernel (kfor);
+  grid_expand_omp_for_loop (kfor);
 
   /* Remove the omp for statement */
   gimple_stmt_iterator gsi = gsi_last_bb (gpukernel->entry);
@@ -13602,7 +13636,7 @@ grid_expand_target_grid_body (struct omp_region *target)
      TODO: It would be great if lowering produced references into the GPU
      kernel decl straight away and we did not have to do this.  */
-  struct arg_decl_map adm;
+  struct grid_arg_decl_map adm;
   adm.old_arg = old_parm_decl;
   adm.new_arg = new_parm_decl;
   basic_block bb;
@@ -13614,7 +13648,7 @@ grid_expand_target_grid_body (struct omp_region *target)
	  struct walk_stmt_info wi;
	  memset (&wi, 0, sizeof (wi));
	  wi.info = &adm;
-	  walk_gimple_op (stmt, remap_kernel_arg_accesses, &wi);
+	  walk_gimple_op (stmt, grid_remap_kernel_arg_accesses, &wi);
	}
     }
   pop_cfun ();
@@ -13642,7 +13676,7 @@ expand_omp (struct omp_region *region)
       if (region->type == GIMPLE_OMP_PARALLEL)
	determine_parallel_type (region);
       else if (region->type == GIMPLE_OMP_TARGET)
-	expand_target_kernel_body (region);
+	grid_expand_target_grid_body (region);
 
       if (region->type == GIMPLE_OMP_FOR
	  && gimple_omp_for_combined_p (last_stmt (region->entry)))
@@ -15021,11 +15055,11 @@ lower_omp_for (gimple_stmt_iterator *gsi_p, omp_context *ctx)
			   ctx);
     }
 
-  if (!gimple_omp_for_kernel_phony (stmt))
+  if (!gimple_omp_for_grid_phony (stmt))
     gimple_seq_add_stmt (&body, stmt);
   gimple_seq_add_seq (&body, gimple_omp_body (stmt));
 
-  if (!gimple_omp_for_kernel_phony (stmt))
+  if (!gimple_omp_for_grid_phony (stmt))
     gimple_seq_add_stmt (&body, gimple_build_omp_continue (fd.loop.v,
							   fd.loop.v));
@@ -15039,7 +15073,7 @@ lower_omp_for (gimple_stmt_iterator *gsi_p, omp_context *ctx)
   body = maybe_catch_exception (body);
 
-  if (!gimple_omp_for_kernel_phony (stmt))
+  if (!gimple_omp_for_grid_phony (stmt))
     {
       /* Region exit marker goes at the end of the loop body.  */
       gimple_seq_add_stmt (&body, gimple_build_omp_return (fd.have_nowait));
@@ -15487,8 +15521,8 @@ lower_omp_taskreg (gimple_stmt_iterator *gsi_p, omp_context *ctx)
   par_olist = NULL;
   par_ilist = NULL;
   par_rlist = NULL;
-  bool phony_construct = is_a <gomp_parallel *> (stmt)
-    && gimple_omp_parallel_kernel_phony (as_a <gomp_parallel *> (stmt));
+  bool phony_construct = gimple_code (stmt) == GIMPLE_OMP_PARALLEL
+    && gimple_omp_parallel_grid_phony (as_a <gomp_parallel *> (stmt));
   if (phony_construct && ctx->record_type)
     {
       gcc_checking_assert (!ctx->receiver_decl);
@@ -16703,7 +16737,7 @@ lower_omp_teams (gimple_stmt_iterator *gsi_p, omp_context *ctx)
			   &bind_body, &dlist, ctx, NULL);
   lower_omp (gimple_omp_body_ptr (teams_stmt), ctx);
   lower_reduction_clauses (gimple_omp_teams_clauses (teams_stmt), &olist, ctx);
-  if (!gimple_omp_teams_kernel_phony (teams_stmt))
+  if (!gimple_omp_teams_grid_phony (teams_stmt))
     {
       gimple_seq_add_stmt (&bind_body, teams_stmt);
       location_t loc = gimple_location (teams_stmt);
@@ -16717,7 +16751,7 @@ lower_omp_teams (gimple_stmt_iterator *gsi_p, omp_context *ctx)
   gimple_omp_set_body (teams_stmt, NULL);
   gimple_seq_add_seq (&bind_body, olist);
   gimple_seq_add_seq (&bind_body, dlist);
-  if (!gimple_omp_teams_kernel_phony (teams_stmt))
+  if (!gimple_omp_teams_grid_phony (teams_stmt))
     gimple_seq_add_stmt (&bind_body, gimple_build_omp_return (true));
   gimple_bind_set_body (bind, bind_body);
@@ -16951,7 +16985,7 @@ lower_omp_1 (gimple_stmt_iterator *gsi_p, omp_context *ctx)
       gcc_assert (ctx);
       lower_omp_teams (gsi_p, ctx);
       break;
-    case GIMPLE_OMP_GPUKERNEL:
+    case GIMPLE_OMP_GRID_BODY:
       ctx = maybe_lookup_ctx (stmt);
       gcc_assert (ctx);
       lower_omp_gpukernel (gsi_p, ctx);
@@ -17050,7 +17084,7 @@ lower_omp (gimple_seq *body, omp_context *ctx)
    VAR_DECL.  */
 
 static bool
-reg_assignment_to_local_var_p (gimple *stmt)
+grid_reg_assignment_to_local_var_p (gimple *stmt)
 {
   gassign *assign = dyn_cast <gassign *> (stmt);
   if (!assign)
@@ -17067,27 +17101,26 @@ grid_reg_assignment_to_local_var_p (gimple *stmt)
    variables.  */
 
 static bool
-seq_only_contains_local_assignments (gimple_seq seq)
+grid_seq_only_contains_local_assignments (gimple_seq seq)
 {
   if (!seq)
     return true;
 
   gimple_stmt_iterator gsi;
   for (gsi = gsi_start (seq); !gsi_end_p (gsi); gsi_next (&gsi))
-    if (!reg_assignment_to_local_var_p (gsi_stmt (gsi)))
+    if (!grid_reg_assignment_to_local_var_p (gsi_stmt (gsi)))
       return false;
   return true;
 }
 
-
 /* Scan statements in SEQ and call itself recursively on any bind.  If during
    whole search only assignments to register-type local variables and one
    single OMP statement is encountered, return true, otherwise return false.
-   8RET is where we store any OMP statement encountered.  TARGET_LOC and NAME
+   RET is where we store any OMP statement encountered.  TARGET_LOC and NAME
    are used for dumping a note about a failure.  */
 
 static bool
-find_single_omp_among_assignments_1 (gimple_seq seq, location_t target_loc,
-				     const char *name, gimple **ret)
+grid_find_single_omp_among_assignments_1 (gimple_seq seq,
					  location_t target_loc,
					  const char *name, gimple **ret)
 {
   gimple_stmt_iterator gsi;
@@ -17095,12 +17128,12 @@ grid_find_single_omp_among_assignments_1 (gimple_seq seq, location_t target_loc,
     {
       gimple *stmt = gsi_stmt (gsi);
 
-      if (reg_assignment_to_local_var_p (stmt))
+      if (grid_reg_assignment_to_local_var_p (stmt))
	continue;
       if (gbind *bind = dyn_cast <gbind *> (stmt))
	{
-	  if (!find_single_omp_among_assignments_1 (gimple_bind_body (bind),
-						    target_loc, name, ret))
+	  if (!grid_find_single_omp_among_assignments_1 (gimple_bind_body (bind),
							 target_loc, name, ret))
	    return false;
	}
       else if (is_gimple_omp (stmt))
@@ -17136,8 +17169,8 @@
    failure.
*/ static gimple * -find_single_omp_among_assignments (gimple_seq seq, location_t target_loc, - const char *name) +grid_find_single_omp_among_assignments (gimple_seq seq, location_t target_loc, + const char *name) { if (!seq) { @@ -17151,7 +17184,7 @@ find_single_omp_among_assignments (gimple_seq seq, location_t target_loc, } gimple *ret = NULL; - if (find_single_omp_among_assignments_1 (seq, target_loc, name, &ret)) + if (grid_find_single_omp_among_assignments_1 (seq, target_loc, name, &ret)) { if (!ret && dump_enabled_p ()) dump_printf_loc (MSG_NOTE, target_loc, @@ -17169,8 +17202,9 @@ find_single_omp_among_assignments (gimple_seq seq, location_t target_loc, function is found. */ static tree -find_ungridifiable_statement (gimple_stmt_iterator *gsi, bool *handled_ops_p, - struct walk_stmt_info *) +grid_find_ungridifiable_statement (gimple_stmt_iterator *gsi, + bool *handled_ops_p, + struct walk_stmt_info *) { *handled_ops_p = false; gimple *stmt = gsi_stmt (*gsi); @@ -17210,14 +17244,15 @@ find_ungridifiable_statement (gimple_stmt_iterator *gsi, bool *handled_ops_p, none. 
*/ static bool -target_follows_gridifiable_pattern (gomp_target *target, tree *group_size_p) +grid_target_follows_gridifiable_pattern (gomp_target *target, tree *group_size_p) { if (gimple_omp_target_kind (target) != GF_OMP_TARGET_KIND_REGION) return false; location_t tloc = gimple_location (target); - gimple *stmt = find_single_omp_among_assignments (gimple_omp_body (target), - tloc, "target"); + gimple *stmt + = grid_find_single_omp_among_assignments (gimple_omp_body (target), + tloc, "target"); if (!stmt) return false; gomp_teams *teams = dyn_cast <gomp_teams *> (stmt); @@ -17263,8 +17298,8 @@ target_follows_gridifiable_pattern (gomp_target *target, tree *group_size_p) clauses = OMP_CLAUSE_CHAIN (clauses); } - stmt = find_single_omp_among_assignments (gimple_omp_body (teams), tloc, - "teams"); + stmt = grid_find_single_omp_among_assignments (gimple_omp_body (teams), tloc, + "teams"); if (!stmt) return false; gomp_for *dist = dyn_cast <gomp_for *> (stmt); @@ -17312,8 +17347,8 @@ target_follows_gridifiable_pattern (gomp_target *target, tree *group_size_p) } group_size = fd.chunk_size; } - stmt = find_single_omp_among_assignments (gimple_omp_body (dist), tloc, - "distribute"); + stmt = grid_find_single_omp_among_assignments (gimple_omp_body (dist), tloc, + "distribute"); gomp_parallel *par; if (!stmt || !(par = dyn_cast <gomp_parallel *> (stmt))) return false; @@ -17343,8 +17378,8 @@ target_follows_gridifiable_pattern (gomp_target *target, tree *group_size_p) clauses = OMP_CLAUSE_CHAIN (clauses); } - stmt = find_single_omp_among_assignments (gimple_omp_body (par), tloc, - "parallel"); + stmt = grid_find_single_omp_among_assignments (gimple_omp_body (par), tloc, + "parallel"); gomp_for *gfor; if (!stmt || !(gfor = dyn_cast <gomp_for *> (stmt))) return false; @@ -17368,7 +17403,7 @@ target_follows_gridifiable_pattern (gomp_target *target, tree *group_size_p) return false; } - if (!seq_only_contains_local_assignments (gimple_omp_for_pre_body (gfor))) + if 
(!grid_seq_only_contains_local_assignments (gimple_omp_for_pre_body (gfor))) { if (dump_enabled_p ()) dump_printf_loc (MSG_NOTE, tloc, @@ -17412,7 +17447,7 @@ target_follows_gridifiable_pattern (gomp_target *target, tree *group_size_p) struct walk_stmt_info wi; memset (&wi, 0, sizeof (wi)); if (gimple *bad = walk_gimple_seq (gimple_omp_body (gfor), - find_ungridifiable_statement, + grid_find_ungridifiable_statement, NULL, &wi)) { if (dump_enabled_p ()) @@ -17464,7 +17499,7 @@ remap_prebody_decls (tree *tp, int *walk_subtrees, void *data) before DST, Creating temporaries, adjusting mapping of operands in WI and remapping operands as necessary. Add any new temporaries to TGT_BIND. Return the first statement that does not conform to - reg_assignment_to_local_var_p or NULL. */ + grid_reg_assignment_to_local_var_p or NULL. */ static gimple * copy_leading_local_assignments (gimple_seq src, gimple_stmt_iterator *dst, @@ -17484,7 +17519,7 @@ copy_leading_local_assignments (gimple_seq src, gimple_stmt_iterator *dst, else continue; } - if (!reg_assignment_to_local_var_p (stmt)) + if (!grid_reg_assignment_to_local_var_p (stmt)) return stmt; tree lhs = gimple_assign_lhs (as_a <gassign *> (stmt)); tree repl = copy_var_decl (lhs, create_tmp_var_name (NULL), @@ -17506,13 +17541,13 @@ copy_leading_local_assignments (gimple_seq src, gimple_stmt_iterator *dst, adding new temporaries to TGT_BIND. 
*/ static gomp_for * -process_kernel_body_copy (gimple_seq seq, gimple_stmt_iterator *dst, - gbind *tgt_bind, struct walk_stmt_info *wi) +grid_process_kernel_body_copy (gimple_seq seq, gimple_stmt_iterator *dst, + gbind *tgt_bind, struct walk_stmt_info *wi) { gimple *stmt = copy_leading_local_assignments (seq, dst, tgt_bind, wi); gomp_teams *teams = dyn_cast <gomp_teams *> (stmt); gcc_assert (teams); - gimple_omp_teams_set_kernel_phony (teams, true); + gimple_omp_teams_set_grid_phony (teams, true); stmt = copy_leading_local_assignments (gimple_omp_body (teams), dst, tgt_bind, wi); gcc_checking_assert (stmt); @@ -17521,17 +17556,17 @@ process_kernel_body_copy (gimple_seq seq, gimple_stmt_iterator *dst, gimple_seq prebody = gimple_omp_for_pre_body (dist); if (prebody) copy_leading_local_assignments (prebody, dst, tgt_bind, wi); - gimple_omp_for_set_kernel_phony (dist, true); + gimple_omp_for_set_grid_phony (dist, true); stmt = copy_leading_local_assignments (gimple_omp_body (dist), dst, tgt_bind, wi); gcc_checking_assert (stmt); gomp_parallel *parallel = as_a <gomp_parallel *> (stmt); - gimple_omp_parallel_set_kernel_phony (parallel, true); + gimple_omp_parallel_set_grid_phony (parallel, true); stmt = copy_leading_local_assignments (gimple_omp_body (parallel), dst, tgt_bind, wi); gomp_for *inner_loop = as_a <gomp_for *> (stmt); - gimple_omp_for_set_kind (inner_loop, GF_OMP_FOR_KIND_KERNEL_BODY); + gimple_omp_for_set_kind (inner_loop, GF_OMP_FOR_KIND_GRID_LOOP); prebody = gimple_omp_for_pre_body (inner_loop); if (prebody) copy_leading_local_assignments (prebody, dst, tgt_bind, wi); @@ -17545,11 +17580,12 @@ process_kernel_body_copy (gimple_seq seq, gimple_stmt_iterator *dst, added. 
*/ static void -attempt_target_gridification (gomp_target *target, gimple_stmt_iterator *gsi, - gbind *tgt_bind) +grid_attempt_target_gridification (gomp_target *target, + gimple_stmt_iterator *gsi, + gbind *tgt_bind) { tree group_size; - if (!target || !target_follows_gridifiable_pattern (target, &group_size)) + if (!target || !grid_target_follows_gridifiable_pattern (target, &group_size)) return; location_t loc = gimple_location (target); @@ -17569,8 +17605,8 @@ attempt_target_gridification (gomp_target *target, gimple_stmt_iterator *gsi, /* Copy assignments in between OMP statements before target, mark OMP statements within copy appropriatly. */ - gomp_for *inner_loop = process_kernel_body_copy (kernel_seq, gsi, tgt_bind, - &wi); + gomp_for *inner_loop = grid_process_kernel_body_copy (kernel_seq, gsi, + tgt_bind, &wi); gbind *old_bind = as_a <gbind *> (gimple_seq_first (gimple_omp_body (target))); gbind *new_bind = as_a <gbind *> (gimple_seq_first (kernel_seq)); @@ -17579,7 +17615,7 @@ attempt_target_gridification (gomp_target *target, gimple_stmt_iterator *gsi, BLOCK_CHAIN (new_block) = BLOCK_SUBBLOCKS (enc_block); BLOCK_SUBBLOCKS (enc_block) = new_block; BLOCK_SUPERCONTEXT (new_block) = enc_block; - gimple *gpukernel = gimple_build_omp_gpukernel (kernel_seq); + gimple *gpukernel = gimple_build_omp_grid_body (kernel_seq); gimple_seq_add_stmt (gimple_bind_body_ptr (as_a <gbind *> (gimple_omp_body (target))), gpukernel); @@ -17636,8 +17672,8 @@ attempt_target_gridification (gomp_target *target, gimple_stmt_iterator *gsi, tree c = build_omp_clause (UNKNOWN_LOCATION, OMP_CLAUSE__GRIDDIM_); OMP_CLAUSE_SET_GRIDDIM_DIMENSION (c, (unsigned int) i); - OMP_CLAUSE_GRIDDIM_SIZE (c) = gs; - OMP_CLAUSE_GRIDDIM_GROUP (c) = ws; + OMP_CLAUSE__GRIDDIM__SIZE (c) = gs; + OMP_CLAUSE__GRIDDIM__GROUP (c) = ws; OMP_CLAUSE_CHAIN (c) = gimple_omp_target_clauses (target); gimple_omp_target_set_clauses (target, c); } @@ -17649,8 +17685,9 @@ attempt_target_gridification (gomp_target 
*target, gimple_stmt_iterator *gsi, /* Walker function doing all the work for create_target_kernels. */ static tree -create_target_gpukernel_stmt (gimple_stmt_iterator *gsi, bool *handled_ops_p, - struct walk_stmt_info *incoming) +grid_gridify_all_targets_stmt (gimple_stmt_iterator *gsi, + bool *handled_ops_p, + struct walk_stmt_info *incoming) { *handled_ops_p = false; @@ -17660,7 +17697,7 @@ create_target_gpukernel_stmt (gimple_stmt_iterator *gsi, bool *handled_ops_p, { gbind *tgt_bind = (gbind *) incoming->info; gcc_checking_assert (tgt_bind); - attempt_target_gridification (target, gsi, tgt_bind); + grid_attempt_target_gridification (target, gsi, tgt_bind); return NULL_TREE; } gbind *bind = dyn_cast <gbind *> (stmt); @@ -17671,25 +17708,24 @@ create_target_gpukernel_stmt (gimple_stmt_iterator *gsi, bool *handled_ops_p, memset (&wi, 0, sizeof (wi)); wi.info = bind; walk_gimple_seq_mod (gimple_bind_body_ptr (bind), - create_target_gpukernel_stmt, NULL, &wi); + grid_gridify_all_targets_stmt, NULL, &wi); } return NULL_TREE; } -/* Prepare all target constructs in BODY_P for GPU kernel generation, if they - follow a gridifiable pattern. All such targets will have their bodies - duplicated, with the new copy being put into a gpukernel. All - kernel-related construct within the gpukernel will be marked with phony - flags or kernel kinds. Moreover, some re-structuring is often needed, such - as copying pre-bodies before the target construct so that kernel grid sizes - can be computed. */ +/* Attempt to gridify all target constructs in BODY_P. All such targets will + have their bodies duplicated, with the new copy being put into a + gimple_omp_grid_body statement. All kernel-related construct within the + grid_body will be marked with phony flags or kernel kinds. Moreover, some + re-structuring is often needed, such as copying pre-bodies before the target + construct so that kernel grid sizes can be computed. 
*/ static void -create_target_gpukernels (gimple_seq *body_p) +grid_gridify_all_targets (gimple_seq *body_p) { struct walk_stmt_info wi; memset (&wi, 0, sizeof (wi)); - walk_gimple_seq_mod (body_p, create_target_gpukernel_stmt, NULL, &wi); + walk_gimple_seq_mod (body_p, grid_gridify_all_targets_stmt, NULL, &wi); } @@ -17715,7 +17751,7 @@ execute_lower_omp (void) if (hsa_gen_requested_p () && PARAM_VALUE (PARAM_OMP_GPU_GRIDIFY) == 1) - create_target_gpukernels (&body); + grid_gridify_all_targets (&body); scan_omp (&body, NULL); gcc_assert (taskreg_nesting_level == 0); @@ -18054,7 +18090,7 @@ make_gimple_omp_edges (basic_block bb, struct omp_region **region, case GIMPLE_OMP_TASKGROUP: case GIMPLE_OMP_CRITICAL: case GIMPLE_OMP_SECTION: - case GIMPLE_OMP_GPUKERNEL: + case GIMPLE_OMP_GRID_BODY: cur_region = new_omp_region (bb, code, cur_region); fallthru = true; break; diff --git a/gcc/tree-pretty-print.c b/gcc/tree-pretty-print.c index ad5cfdb..e250b9f 100644 --- a/gcc/tree-pretty-print.c +++ b/gcc/tree-pretty-print.c @@ -949,10 +949,10 @@ dump_omp_clause (pretty_printer *pp, tree clause, int spc, int flags) pp_string (pp, "_griddim_("); pp_unsigned_wide_integer (pp, OMP_CLAUSE_GRIDDIM_DIMENSION (clause)); pp_colon (pp); - dump_generic_node (pp, OMP_CLAUSE_GRIDDIM_SIZE (clause), spc, flags, + dump_generic_node (pp, OMP_CLAUSE__GRIDDIM__SIZE (clause), spc, flags, false); pp_comma (pp); - dump_generic_node (pp, OMP_CLAUSE_GRIDDIM_GROUP (clause), spc, flags, + dump_generic_node (pp, OMP_CLAUSE__GRIDDIM__GROUP (clause), spc, flags, false); pp_right_paren (pp); break; diff --git a/gcc/tree.c b/gcc/tree.c index 94a36cb..f7fa25e 100644 --- a/gcc/tree.c +++ b/gcc/tree.c @@ -402,7 +402,7 @@ const char * const omp_clause_code_name[] = "num_workers", "vector_length", "tile", - "griddim" + "_griddim_" }; diff --git a/gcc/tree.h b/gcc/tree.h index dc16b84..0ee6723 100644 --- a/gcc/tree.h +++ b/gcc/tree.h @@ -1642,9 +1642,9 @@ extern void protected_set_expr_location (tree, 
location_t); #define OMP_CLAUSE_SET_GRIDDIM_DIMENSION(NODE, DIMENSION) \ (OMP_CLAUSE_SUBCODE_CHECK (NODE, OMP_CLAUSE__GRIDDIM_)\ ->omp_clause.subcode.dimension = (DIMENSION)) -#define OMP_CLAUSE_GRIDDIM_SIZE(NODE) \ +#define OMP_CLAUSE__GRIDDIM__SIZE(NODE) \ OMP_CLAUSE_OPERAND (OMP_CLAUSE_SUBCODE_CHECK (NODE, OMP_CLAUSE__GRIDDIM_), 0) -#define OMP_CLAUSE_GRIDDIM_GROUP(NODE) \ +#define OMP_CLAUSE__GRIDDIM__GROUP(NODE) \ OMP_CLAUSE_OPERAND (OMP_CLAUSE_SUBCODE_CHECK (NODE, OMP_CLAUSE__GRIDDIM_), 1) /* SSA_NAME accessors. */ -- 2.6.3