Hi! This patch adds the processing of OpenACC declare directive in C and C++. (Note: Support in Fortran is already in trunk.) Commentary on the changes is included as an attachment (NOTES).
All of the code is in the gomp-4_0-branch. Regtested on x86_64-linux. Thanks! Jim
Background The declare directive is used to allocate device memory for the entire scope of a variable / array within a program, function, or subroutine. Consider the following example. int a[10]; #pragma acc declare create (a) void func (int *a) { int b[10]; #pragma acc declare create (b) #pragma acc parallel present (a, b) { int i; for (i = 0; i < 10; i++) { b[i] = a[i]; a[i] = b[i] + i; } } return; } int main (int argc, char **argv) { func (&a[0]); return 0; } In the example, array 'a' will be allocated on the device at the outset of device activity and be available for the duration. Whereas, array 'b', will only be available when 'func' is executing. In other words, array 'b' will be allocated at the outset of execution of 'func' and deallocated at the return from 'func'. In some instances, the clause may require that the host copy of a variable / array be updated prior to a return from a function or subroutine or exiting of the program. C and C++ front-ends Definitions for use in C and C++ were added to identify the declare directive pragma and its' valid clauses. After the clauses have been validated, if the declare directive is for a global variable, then an attribute is created and chained. These attributes will be used during gimplification. Once the user-specified clauses have been parsed, the clauses have to be examined and potentially altered and/or added to. As mentioned in the previous section, with some clauses, e.g., e.g, copy, movement of data has to occur at the entry to something like a function as well as at exit. Hence the need to examine/modify/add to the clauses so as to effect the correct data movement. For all instances of the declare directive, there is at least one set of 'entry' clauses. If the clauses pertain to global variables, a constructor is created. This constructor will 'register' the variable(s) / arrays so that at beginning of OpenACC runtime the variable / arrays will be allocated and be made available throughout program execution. If on the other hand, the 'entry' clauses are not found to be of a global type, then a node is created and the clauses are associated with it. Also note that the 'return' clauses are also associated with the node. Notice that there are 'return' clauses only for non-global variables / arrays. The clauses available for global variables / arrays only allow for data movement at the initiation of program execution. Middle-end The OACC_DECLARE node is handled much the same as other OpenACC nodes that represent directives. However, there is one thing unique to declare, and that is the handling of the 'return' clauses. The 'return' clauses are scanned and then a gimple statment is created, but is not added. However, it is saved to be added after the body has been gimplified. The intent of this last-minute addition is to allow this statement to be executed prior to returning from a function. JAKUB: While this has been working, I'm not completely sure this is the proper means by which to do this in order to guarantee this statement is the last one executed. Please advise otherwise. Callgraph The make_offload functionality has been refactored to handle OpenACC variables / arrays that are not external and and have 'oacc declare' attributes associated with them. Once 'offloaded', the attributes are removed. libgomp A function has been added to handle the 'registration' of global variables that will be allocated on the device. This function is called by the constructor which was generated as part of handling the declare directive by the compiler. An additional function has been added to allocate device memory from the list of registered variables / arrays. Another function has been added to deallocate that device. These functions are called at the appropriate places in the runtime. Finally, a function has been added to handle the declare builtin which is emitted by the compiler. Testing New compile and runtime tests have been added. (NOTE: The numbering for the runtime tests has a gap in it. These tests use both the declare and routine directive. The support for the routine directive has yet to be added to trunk, so these tests will appear once the support has been committed.)
2015-10-27 James Norris <jnor...@codesourcery.com> gcc/ * builtin-types.def (BT_FN_VOID_PTR_INT_UINT): New type. * c-family/c-common.c (c_common_attribute_table): New oacc_declare. * c-family/c-pragma.c (oacc_pragmas): Add entry for declare directive. * c-family/c-pragma.h (enum pragma_kind): Add PRAGMA_OACC_DECLARE. (enum pragma_omp_clause): Add PRAGMA_OACC_CLAUSE_DEVICE_RESIDENT. * c/c-parser.c (c_parser_pragma): Handle PRAGMA_OACC_DECLARE. (c_parser_omp_clause_name): Handle 'device_resident' clause. (c_parser_oacc_data_clause): Handle PRAGMA_OACC_CLAUSE_DEVICE_RESIDENT and PRAGMA_OMP_CLAUSE_LINK. (c_parser_oacc_all_clauses): Handle PRAGMA_OACC_CLAUSE_DEVICE_RESIDENT and PRAGMA_OACC_CLAUSE_LINK. (OACC_DECLARE_CLAUSE_MASK): New definition. (c_parser_oacc_declare): New function. * cp/parser.c (cp_parser_omp_clause_name): Handle 'device_resident' clause. (cp_parser_oacc_data_clause): Handle PRAGMA_OACC_CLAUSE_DEVICE_RESIDENT and PRAGMA_OMP_CLAUSE_LINK. (OACC_DECLARE_CLAUSE_MASK): New definition. (cp_parser_oacc_declare): New function. (cp_paser_oacc_all_clauses): Handle PRAGMA_OACC_CLAUSE_DEVICE_RESIDENT and PRAGMA_OMP_CLAUSE_LINK. (cp_parser_pragma): Handle PRAGMA_OACC_DECLARE. * cp/pt.c (tsubst_expr): Handle OACC_DECLARE. * fortran/types.def (BT_FN_VOID_PTR_INT_UINT): New type. * gimple.h (enum gf_mask): Add GF_OMP_TARGET_KIND_OACC_DECLARE. (is_gomple_omp_oacc): Handle GF_OMP_TARGET_KIND_OACC_DECLARE. * gimplify.c (struct gimplify_omp_ctx): New field. (new_omp_context): Initialize new field. (omp_default_clause): Handle device resident variable. (gimplify_oacc_declare): New function. (device_resident_p): New function. (gimplify_expr): Handle OACC_DECLARE. (gimplify_body): Handle updating of declare'd variables. * omp-builtins.def (BUILT_IN_GOACC_STATIC, BUILT_IN_GOACC_DECLARE): New builtins. * omp-low.c (expand_omp_target): Handle GF_OMP_TARGET_KIND_OACC_DECLARE and BUILTIN_GOACC_DECLARE. (lower_omp_target): Handle GF_OMP_TARGET_KIND_OACC_DECLARE, GOMP_MAP_DEVICE_RESIDENT and GOMP_MAP_LINK. (make_gimple_omp_edges): Handle GF_OMP_TARGET_KIND_OACC_DECLARE. * testsuite/c-c++-common/goacc/declare-1.c: New test. * testsuite/c-c++-common/goacc/declare-2.c: Likewise. * tree.def (OACC_DECLARE): Update operands. * tree.h (OACC_DECLARE_RETURN_CLAUSES): New definition. * varpool.c (make_offloadable_1, make_offloadable): New functions. (get_create): Refactor offload functionality. include/ * gomp-constants.h (enum gomp_map_kind): Add GOMP_MAP_DEVICE_RESIDENT and GOMP_MAP_LINK. libgomp/ * libgomp.map (GOACC_2.0): Export GOACC_declare. * oacc-init.c (acc_shutdown_1): Add call. * oacc-int.h (goacc_allocate_static, goacc_deallocate_static): New declarations. * oacc-parallel.c (struct oacc_static): New struture. (goacc_allocate_static, goacc_deallocate_static, GOACC_register_static, GOACC_declare): New functions. * testsuite/libgomp.oacc-c-c++-common/declare-1.c: New test. * testsuite/libgomp.oacc-c-c++-common/declare-5.c: Likewise.
diff --git a/gcc/builtin-types.def b/gcc/builtin-types.def index b561436..a109806 100644 --- a/gcc/builtin-types.def +++ b/gcc/builtin-types.def @@ -450,6 +450,7 @@ DEF_FUNCTION_TYPE_3 (BT_FN_BOOL_ULONG_ULONG_ULONGPTR, BT_BOOL, BT_ULONG, BT_ULONG, BT_PTR_ULONG) DEF_FUNCTION_TYPE_3 (BT_FN_BOOL_ULONGLONG_ULONGLONG_ULONGLONGPTR, BT_BOOL, BT_ULONGLONG, BT_ULONGLONG, BT_PTR_ULONGLONG) +DEF_FUNCTION_TYPE_3 (BT_FN_VOID_PTR_INT_UINT, BT_VOID, BT_PTR, BT_INT, BT_UINT) DEF_FUNCTION_TYPE_4 (BT_FN_SIZE_CONST_PTR_SIZE_SIZE_FILEPTR, BT_SIZE, BT_CONST_PTR, BT_SIZE, BT_SIZE, BT_FILEPTR) diff --git a/gcc/c-family/c-common.c b/gcc/c-family/c-common.c index 1c75921..ffaa983 100644 --- a/gcc/c-family/c-common.c +++ b/gcc/c-family/c-common.c @@ -832,6 +832,7 @@ const struct attribute_spec c_common_attribute_table[] = handle_bnd_legacy, false }, { "bnd_instrument", 0, 0, true, false, false, handle_bnd_instrument, false }, + { "oacc declare", 0, -1, true, false, false, NULL, false }, { NULL, 0, 0, false, false, false, NULL, false } }; diff --git a/gcc/c-family/c-pragma.c b/gcc/c-family/c-pragma.c index 834a916..fbeb8ecd 100644 --- a/gcc/c-family/c-pragma.c +++ b/gcc/c-family/c-pragma.c @@ -1212,6 +1212,7 @@ struct omp_pragma_def { const char *name; unsigned int id; }; static const struct omp_pragma_def oacc_pragmas[] = { { "cache", PRAGMA_OACC_CACHE }, { "data", PRAGMA_OACC_DATA }, + { "declare", PRAGMA_OACC_DECLARE }, { "enter", PRAGMA_OACC_ENTER_DATA }, { "exit", PRAGMA_OACC_EXIT_DATA }, { "kernels", PRAGMA_OACC_KERNELS }, diff --git a/gcc/c-family/c-pragma.h b/gcc/c-family/c-pragma.h index cec920f..dcba221 100644 --- a/gcc/c-family/c-pragma.h +++ b/gcc/c-family/c-pragma.h @@ -29,6 +29,7 @@ enum pragma_kind { PRAGMA_OACC_CACHE, PRAGMA_OACC_DATA, + PRAGMA_OACC_DECLARE, PRAGMA_OACC_ENTER_DATA, PRAGMA_OACC_EXIT_DATA, PRAGMA_OACC_KERNELS, @@ -150,6 +151,7 @@ enum pragma_omp_clause { PRAGMA_OACC_CLAUSE_CREATE, PRAGMA_OACC_CLAUSE_DELETE, PRAGMA_OACC_CLAUSE_DEVICEPTR, + PRAGMA_OACC_CLAUSE_DEVICE_RESIDENT, PRAGMA_OACC_CLAUSE_GANG, PRAGMA_OACC_CLAUSE_HOST, PRAGMA_OACC_CLAUSE_NUM_GANGS, diff --git a/gcc/c/c-parser.c b/gcc/c/c-parser.c index c8c6a2d..6af7229 100644 --- a/gcc/c/c-parser.c +++ b/gcc/c/c-parser.c @@ -1256,6 +1256,7 @@ static bool c_parser_omp_target (c_parser *, enum pragma_context); static void c_parser_omp_end_declare_target (c_parser *); static void c_parser_omp_declare (c_parser *, enum pragma_context); static bool c_parser_omp_ordered (c_parser *, enum pragma_context); +static void c_parser_oacc_declare (c_parser *parser); /* These Objective-C parser functions are only ever called when compiling Objective-C. */ @@ -9702,6 +9703,10 @@ c_parser_pragma (c_parser *parser, enum pragma_context context) switch (id) { + case PRAGMA_OACC_DECLARE: + c_parser_oacc_declare (parser); + return false; + case PRAGMA_OACC_ENTER_DATA: c_parser_oacc_enter_exit_data (parser, true); return false; @@ -9987,6 +9992,8 @@ c_parser_omp_clause_name (c_parser *parser) result = PRAGMA_OMP_CLAUSE_DEVICE; else if (!strcmp ("deviceptr", p)) result = PRAGMA_OACC_CLAUSE_DEVICEPTR; + else if (!strcmp ("device_resident", p)) + result = PRAGMA_OACC_CLAUSE_DEVICE_RESIDENT; else if (!strcmp ("dist_schedule", p)) result = PRAGMA_OMP_CLAUSE_DIST_SCHEDULE; break; @@ -10419,10 +10426,16 @@ c_parser_oacc_data_clause (c_parser *parser, pragma_omp_clause c_kind, case PRAGMA_OACC_CLAUSE_DEVICE: kind = GOMP_MAP_FORCE_TO; break; + case PRAGMA_OACC_CLAUSE_DEVICE_RESIDENT: + kind = GOMP_MAP_DEVICE_RESIDENT; + break; case PRAGMA_OACC_CLAUSE_HOST: case PRAGMA_OACC_CLAUSE_SELF: kind = GOMP_MAP_FORCE_FROM; break; + case PRAGMA_OMP_CLAUSE_LINK: + kind = GOMP_MAP_LINK; + break; case PRAGMA_OACC_CLAUSE_PRESENT: kind = GOMP_MAP_FORCE_PRESENT; break; @@ -12425,6 +12438,10 @@ c_parser_oacc_all_clauses (c_parser *parser, omp_clause_mask mask, clauses = c_parser_oacc_data_clause_deviceptr (parser, clauses); c_name = "deviceptr"; break; + case PRAGMA_OACC_CLAUSE_DEVICE_RESIDENT: + clauses = c_parser_oacc_data_clause (parser, c_kind, clauses); + c_name = "device_resident"; + break; case PRAGMA_OACC_CLAUSE_FIRSTPRIVATE: clauses = c_parser_omp_clause_firstprivate (parser, clauses); c_name = "firstprivate"; @@ -12437,6 +12454,10 @@ c_parser_oacc_all_clauses (c_parser *parser, omp_clause_mask mask, clauses = c_parser_omp_clause_if (parser, clauses, false); c_name = "if"; break; + case PRAGMA_OMP_CLAUSE_LINK: + clauses = c_parser_oacc_data_clause (parser, c_kind, clauses); + c_name = "link"; + break; case PRAGMA_OACC_CLAUSE_NUM_GANGS: clauses = c_parser_omp_clause_num_gangs (parser, clauses); c_name = "num_gangs"; @@ -12929,6 +12950,281 @@ c_parser_oacc_kernels (location_t loc, c_parser *parser, char *p_name) } /* OpenACC 2.0: + # pragma acc declare oacc-data-clause[optseq] new-line +*/ + +static int oacc_dcl_idx = 0; + + +#define OACC_DECLARE_CLAUSE_MASK \ + ( (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_COPY) \ + | (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_COPYIN) \ + | (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_COPYOUT) \ + | (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_CREATE) \ + | (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_DEVICEPTR) \ + | (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_DEVICE_RESIDENT) \ + | (OMP_CLAUSE_MASK_1 << PRAGMA_OMP_CLAUSE_LINK) \ + | (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_PRESENT) \ + | (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_PRESENT_OR_COPY) \ + | (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_PRESENT_OR_COPYIN) \ + | (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_PRESENT_OR_COPYOUT) \ + | (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_PRESENT_OR_CREATE) ) + +static void +c_parser_oacc_declare (c_parser *parser) +{ + location_t pragma_loc = c_parser_peek_token (parser)->location; + tree c, clauses, ret_clauses, stmt, t; + + bool error = false; + + c_parser_consume_pragma (parser); + + clauses = c_parser_oacc_all_clauses (parser, OACC_DECLARE_CLAUSE_MASK, + "#pragma acc declare"); + if (!clauses) + { + error_at (pragma_loc, + "no valid clauses specified in %<#pragma acc declare%>"); + return; + } + + for (t = clauses; t; t = OMP_CLAUSE_CHAIN (t)) + { + location_t loc = OMP_CLAUSE_LOCATION (t); + tree decl = OMP_CLAUSE_DECL (t); + tree devres = NULL_TREE; + if (!DECL_P (decl)) + { + error_at (loc, "subarray in %<#pragma acc declare%>"); + error = true; + continue; + } + + switch (OMP_CLAUSE_MAP_KIND (t)) + { + case GOMP_MAP_FORCE_ALLOC: + case GOMP_MAP_FORCE_TO: + case GOMP_MAP_FORCE_DEVICEPTR: + break; + + case GOMP_MAP_DEVICE_RESIDENT: + devres = t; + break; + + case GOMP_MAP_POINTER: + /* Generated by c_finish_omp_clauses from array sections; + avoid spurious diagnostics. */ + break; + + case GOMP_MAP_LINK: + if (!global_bindings_p () && !DECL_EXTERNAL (decl)) + { + error_at (loc, + "%qD must be a global variable in" + "%<#pragma acc declare link%>", + decl); + error = true; + continue; + } + break; + + default: + if (global_bindings_p ()) + { + error_at (loc, "invalid OpenACC clause at file scope"); + error = true; + continue; + } + if (DECL_EXTERNAL (decl)) + { + error_at (loc, + "invalid use of %<extern%> variable %qD " + "in %<#pragma acc declare%>", decl); + error = true; + continue; + } + else if (TREE_PUBLIC (decl)) + { + error_at (loc, + "invalid use of %<global%> variable %qD " + "in %<#pragma acc declare%>", decl); + error = true; + continue; + } + break; + } + + tree decl_for_attr = decl; + tree prev_attr = lookup_attribute ("oacc declare", + DECL_ATTRIBUTES (decl)); + if (prev_attr) + { + tree p = TREE_VALUE (prev_attr); + tree cl = TREE_VALUE (p); + + if (!devres && OMP_CLAUSE_MAP_KIND (cl) != GOMP_MAP_DEVICE_RESIDENT) + { + error_at (loc, "variable %qD used more than once with " + "%<#pragma acc declare%>", decl); + inform (OMP_CLAUSE_LOCATION (cl), "previous directive was here"); + error = true; + continue; + } + } + + if (!error && global_bindings_p ()) + { + tree attr = tree_cons (NULL_TREE, clauses, NULL_TREE); + tree attrs = tree_cons (get_identifier ("oacc declare"), + attr, NULL_TREE); + decl_attributes (&decl_for_attr, attrs, 0); + } + } + + if (error) + return; + + ret_clauses = NULL_TREE; + + for (c = clauses; c; c = OMP_CLAUSE_CHAIN (c)) + { + bool ret = false; + HOST_WIDE_INT kind, new_op; + + kind = OMP_CLAUSE_MAP_KIND (c); + + switch (kind) + { + case GOMP_MAP_ALLOC: + case GOMP_MAP_FORCE_ALLOC: + case GOMP_MAP_FORCE_TO: + new_op = GOMP_MAP_FORCE_DEALLOC; + ret = true; + break; + + case GOMP_MAP_FORCE_FROM: + OMP_CLAUSE_SET_MAP_KIND (c, GOMP_MAP_FORCE_ALLOC); + new_op = GOMP_MAP_FORCE_FROM; + ret = true; + break; + + case GOMP_MAP_FORCE_TOFROM: + OMP_CLAUSE_SET_MAP_KIND (c, GOMP_MAP_FORCE_TO); + new_op = GOMP_MAP_FORCE_FROM; + ret = true; + break; + + case GOMP_MAP_FROM: + OMP_CLAUSE_SET_MAP_KIND (c, GOMP_MAP_FORCE_ALLOC); + new_op = GOMP_MAP_FROM; + ret = true; + break; + + case GOMP_MAP_TOFROM: + OMP_CLAUSE_SET_MAP_KIND (c, GOMP_MAP_TO); + new_op = GOMP_MAP_FROM; + ret = true; + break; + + case GOMP_MAP_DEVICE_RESIDENT: + case GOMP_MAP_FORCE_DEVICEPTR: + case GOMP_MAP_FORCE_PRESENT: + case GOMP_MAP_LINK: + case GOMP_MAP_POINTER: + case GOMP_MAP_TO: + break; + + default: + gcc_unreachable (); + break; + } + + if (ret) + { + t = copy_node (c); + + OMP_CLAUSE_SET_MAP_KIND (t, new_op); + + if (ret_clauses) + OMP_CLAUSE_CHAIN (t) = ret_clauses; + + ret_clauses = t; + } + } + + if (global_bindings_p ()) + { + char buf[128]; + struct c_declarator *target; + tree stmt, attrs; + c_arg_info *arg_info = build_arg_info (); + struct c_declarator *declarator; + struct c_declspecs *specs; + struct c_typespec spec; + location_t loc = UNKNOWN_LOCATION; + tree f, t, fnbody, call_fn; + + sprintf (buf, "__openacc_c_constructor__%d", oacc_dcl_idx++); + target = build_id_declarator (get_identifier (buf)); + arg_info->types = void_list_node; + declarator = build_function_declarator (arg_info, target); + + specs = build_null_declspecs (); + spec.kind = ctsk_resword; + spec.spec = get_identifier ("void"); + spec.expr = NULL_TREE; + spec.expr_const_operands = true; + + declspecs_add_type (pragma_loc, specs, spec); + finish_declspecs (specs); + + attrs = tree_cons (get_identifier ("constructor") , NULL_TREE, NULL_TREE); + start_function (specs, declarator, attrs); + store_parm_decls (); + f = c_begin_compound_stmt (true); + TREE_USED (current_function_decl) = 1; + call_fn = builtin_decl_explicit (BUILT_IN_GOACC_STATIC); + TREE_SIDE_EFFECTS (call_fn) = 1; + + for (t = clauses; t; t = OMP_CLAUSE_CHAIN (t)) + { + tree d, a1, a2, a3; + vec<tree, va_gc> *args; + vec_alloc (args, 3); + + d = OMP_CLAUSE_DECL (t); + + a1 = build_unary_op (loc, ADDR_EXPR, d, 0); + a2 = DECL_SIZE_UNIT (d); + a3 = build_int_cst (unsigned_type_node, OMP_CLAUSE_MAP_KIND (t)); + + args->quick_push (a1); + args->quick_push (a2); + args->quick_push (a3); + + stmt = build_function_call_vec (loc, vNULL, call_fn, args, NULL); + add_stmt (stmt); + } + + fnbody = c_end_compound_stmt (loc, f, true); + add_stmt (fnbody); + + finish_function (); + } + else + { + stmt = make_node (OACC_DECLARE); + TREE_TYPE (stmt) = void_type_node; + OACC_DECLARE_CLAUSES (stmt) = clauses; + OACC_DECLARE_RETURN_CLAUSES (stmt) = ret_clauses; + SET_EXPR_LOCATION (stmt, pragma_loc); + + add_stmt (stmt); + } +} + +/* OpenACC 2.0: # pragma acc enter data oacc-enter-data-clause[optseq] new-line or diff --git a/gcc/cp/parser.c b/gcc/cp/parser.c index 7555bf3..4ec16de 100644 --- a/gcc/cp/parser.c +++ b/gcc/cp/parser.c @@ -2179,6 +2179,9 @@ static vec<constructor_elt, va_gc> *cp_parser_initializer_list static bool cp_parser_ctor_initializer_opt_and_function_body (cp_parser *, bool); +static tree cp_parser_oacc_all_clauses (cp_parser *, omp_clause_mask, + const char *, cp_token *, bool); + static tree cp_parser_late_parsing_omp_declare_simd (cp_parser *, tree); @@ -29110,6 +29113,8 @@ cp_parser_omp_clause_name (cp_parser *parser) result = PRAGMA_OMP_CLAUSE_DEVICE; else if (!strcmp ("deviceptr", p)) result = PRAGMA_OACC_CLAUSE_DEVICEPTR; + else if (!strcmp ("device_resident", p)) + result = PRAGMA_OACC_CLAUSE_DEVICE_RESIDENT; else if (!strcmp ("dist_schedule", p)) result = PRAGMA_OMP_CLAUSE_DIST_SCHEDULE; break; @@ -29511,10 +29516,16 @@ cp_parser_oacc_data_clause (cp_parser *parser, pragma_omp_clause c_kind, case PRAGMA_OACC_CLAUSE_DEVICE: kind = GOMP_MAP_FORCE_TO; break; + case PRAGMA_OACC_CLAUSE_DEVICE_RESIDENT: + kind = GOMP_MAP_DEVICE_RESIDENT; + break; case PRAGMA_OACC_CLAUSE_HOST: case PRAGMA_OACC_CLAUSE_SELF: kind = GOMP_MAP_FORCE_FROM; break; + case PRAGMA_OMP_CLAUSE_LINK: + kind = GOMP_MAP_LINK; + break; case PRAGMA_OACC_CLAUSE_PRESENT: kind = GOMP_MAP_FORCE_PRESENT; break; @@ -29543,6 +29554,281 @@ cp_parser_oacc_data_clause (cp_parser *parser, pragma_omp_clause c_kind, } /* OpenACC 2.0: + # pragma acc declare oacc-data-clause[optseq] new-line +*/ + +static int oacc_dcl_idx = 0; + + +#define OACC_DECLARE_CLAUSE_MASK \ + ( (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_COPY) \ + | (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_COPYIN) \ + | (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_COPYOUT) \ + | (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_CREATE) \ + | (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_DEVICEPTR) \ + | (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_DEVICE_RESIDENT) \ + | (OMP_CLAUSE_MASK_1 << PRAGMA_OMP_CLAUSE_LINK) \ + | (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_PRESENT) \ + | (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_PRESENT_OR_COPY) \ + | (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_PRESENT_OR_COPYIN) \ + | (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_PRESENT_OR_COPYOUT) \ + | (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_PRESENT_OR_CREATE)) + +static tree +cp_parser_oacc_declare (cp_parser *parser, cp_token *pragma_tok) +{ + tree c, clauses, ret_clauses, stmt, t; + bool error = false; + + clauses = cp_parser_oacc_all_clauses (parser, OACC_DECLARE_CLAUSE_MASK, + "#pragma acc declare", pragma_tok, true); + + + if (find_omp_clause (clauses, OMP_CLAUSE_MAP) == NULL_TREE) + { + error_at (pragma_tok->location, + "no valid clauses specified in %<#pragma acc declare%>"); + return NULL_TREE; + } + + for (tree t = clauses; t; t = OMP_CLAUSE_CHAIN (t)) + { + location_t loc = OMP_CLAUSE_LOCATION (t); + tree decl = OMP_CLAUSE_DECL (t); + tree devres = NULL_TREE; + if (!DECL_P (decl)) + { + error_at (loc, "subarray in %<#pragma acc declare%>"); + error = true; + continue; + } + gcc_assert (OMP_CLAUSE_CODE (t) == OMP_CLAUSE_MAP); + switch (OMP_CLAUSE_MAP_KIND (t)) + { + case GOMP_MAP_FORCE_ALLOC: + case GOMP_MAP_FORCE_TO: + case GOMP_MAP_FORCE_DEVICEPTR: + break; + + case GOMP_MAP_DEVICE_RESIDENT: + devres = t; + break; + + case GOMP_MAP_POINTER: + /* Generated by c_finish_omp_clauses from array sections; + avoid spurious diagnostics. */ + break; + + case GOMP_MAP_LINK: + if (!global_bindings_p () && !DECL_EXTERNAL (decl)) + { + error_at (loc, + "%qD must be a global variable in" + "%<#pragma acc declare link%>", + decl); + error = true; + continue; + } + break; + + default: + if (global_bindings_p ()) + { + error_at (loc, "invalid OpenACC clause at file scope"); + error = true; + continue; + } + if (DECL_EXTERNAL (decl)) + { + error_at (loc, + "invalid use of %<extern%> variable %qD " + "in %<#pragma acc declare%>", decl); + error = true; + continue; + } + else if (TREE_PUBLIC (decl)) + { + error_at (loc, + "invalid use of %<global%> variable %qD " + "in %<#pragma acc declare%>", decl); + error = true; + continue; + } + break; + } + + tree decl_for_attr = decl; + tree prev_attr = lookup_attribute ("oacc declare", + DECL_ATTRIBUTES (decl)); + if (prev_attr) + { + tree p = TREE_VALUE (prev_attr); + tree cl = TREE_VALUE (p); + + if (!devres && OMP_CLAUSE_MAP_KIND (cl) != GOMP_MAP_DEVICE_RESIDENT) + { + error_at (loc, "variable %qD used more than once with " + "%<#pragma acc declare%>", decl); + inform (OMP_CLAUSE_LOCATION (TREE_VALUE (p)), + "previous directive was here"); + error = true; + continue; + } + } + + if (!error && global_bindings_p ()) + { + tree attr = tree_cons (NULL_TREE, t, NULL_TREE); + tree attrs = tree_cons (get_identifier ("oacc declare"), + attr, NULL_TREE); + decl_attributes (&decl_for_attr, attrs, 0); + } + } + + if (error) + return NULL_TREE; + + ret_clauses = NULL_TREE; + + for (c = clauses; c; c = OMP_CLAUSE_CHAIN (c)) + { + bool ret = false; + HOST_WIDE_INT kind, new_op; + + kind = OMP_CLAUSE_MAP_KIND (c); + + switch (kind) + { + case GOMP_MAP_ALLOC: + case GOMP_MAP_FORCE_ALLOC: + case GOMP_MAP_FORCE_TO: + new_op = GOMP_MAP_FORCE_DEALLOC; + ret = true; + break; + + case GOMP_MAP_FORCE_FROM: + OMP_CLAUSE_SET_MAP_KIND (c, GOMP_MAP_FORCE_ALLOC); + new_op = GOMP_MAP_FORCE_FROM; + ret = true; + break; + + case GOMP_MAP_FORCE_TOFROM: + OMP_CLAUSE_SET_MAP_KIND (c, GOMP_MAP_FORCE_TO); + new_op = GOMP_MAP_FORCE_FROM; + ret = true; + break; + + case GOMP_MAP_FROM: + OMP_CLAUSE_SET_MAP_KIND (c, GOMP_MAP_FORCE_ALLOC); + new_op = GOMP_MAP_FROM; + ret = true; + break; + + case GOMP_MAP_TOFROM: + OMP_CLAUSE_SET_MAP_KIND (c, GOMP_MAP_TO); + new_op = GOMP_MAP_FROM; + ret = true; + break; + + case GOMP_MAP_DEVICE_RESIDENT: + case GOMP_MAP_FORCE_DEVICEPTR: + case GOMP_MAP_FORCE_PRESENT: + case GOMP_MAP_POINTER: + case GOMP_MAP_TO: + break; + + case GOMP_MAP_LINK: + continue; + + default: + gcc_unreachable (); + break; + } + + if (ret) + { + t = copy_node (c); + + OMP_CLAUSE_SET_MAP_KIND (t, new_op); + + if (ret_clauses) + OMP_CLAUSE_CHAIN (t) = ret_clauses; + + ret_clauses = t; + } + } + + if (global_bindings_p ()) + { + char buf[128]; + cp_decl_specifier_seq decl_specifiers; + cp_declarator *declarator; + tree attrs, parms; + tree f, t, call_fn, stmt; + location_t loc = UNKNOWN_LOCATION; + void *p; + + p = obstack_alloc (&declarator_obstack, 0); + clear_decl_specs (&decl_specifiers); + decl_specifiers.type = void_type_node; + sprintf (buf, "__openacc_cp_constructor__%d", oacc_dcl_idx++); + + declarator = make_id_declarator (NULL_TREE, get_identifier (buf), + sfk_none); + parms = void_list_node; + declarator = make_call_declarator (declarator, parms, + TYPE_UNQUALIFIED, + VIRT_SPEC_UNSPECIFIED, + REF_QUAL_NONE, + /*tx_qualifier=*/NULL_TREE, + /*exception_specification=*/NULL_TREE, + /*late_return_type=*/NULL_TREE, + /*requires_clause=*/NULL_TREE); + attrs = tree_cons (get_identifier ("constructor") , NULL_TREE, NULL_TREE); + start_function (&decl_specifiers, declarator, attrs); + f = begin_compound_stmt (0); + call_fn = builtin_decl_explicit (BUILT_IN_GOACC_STATIC); + TREE_SIDE_EFFECTS (call_fn) = 1; + + for (t = clauses; t; t = OMP_CLAUSE_CHAIN (t)) + { + tree d, a1, a2, a3; + vec<tree, va_gc> *args; + vec_alloc (args, 3); + + d = OMP_CLAUSE_DECL (t); + + a1 = build_unary_op (loc, ADDR_EXPR, d, 0); + a2 = DECL_SIZE_UNIT (d); + a3 = build_int_cst (unsigned_type_node, OMP_CLAUSE_MAP_KIND (t)); + + args->quick_push (a1); + args->quick_push (a2); + args->quick_push (a3); + + stmt = build_function_call_vec (loc, vNULL, call_fn, args, NULL); + finish_expr_stmt (stmt); + } + + finish_compound_stmt (f); + expand_or_defer_fn (finish_function (0)); + obstack_free (&declarator_obstack, p); + } + else + { + stmt = make_node (OACC_DECLARE); + TREE_TYPE (stmt) = void_type_node; + OACC_DECLARE_CLAUSES (stmt) = clauses; + OACC_DECLARE_RETURN_CLAUSES (stmt) = ret_clauses; + SET_EXPR_LOCATION (stmt, pragma_tok->location); + + add_stmt (stmt); + } + + return NULL_TREE; +} + +/* OpenACC 2.0: deviceptr ( variable-list ) */ static tree @@ -31338,6 +31624,10 @@ cp_parser_oacc_all_clauses (cp_parser *parser, omp_clause_mask mask, clauses = cp_parser_oacc_data_clause_deviceptr (parser, clauses); c_name = "deviceptr"; break; + case PRAGMA_OACC_CLAUSE_DEVICE_RESIDENT: + clauses = cp_parser_oacc_data_clause (parser, c_kind, clauses); + c_name = "device_resident"; + break; case PRAGMA_OACC_CLAUSE_HOST: clauses = cp_parser_oacc_data_clause (parser, c_kind, clauses); c_name = "host"; @@ -31346,6 +31636,10 @@ cp_parser_oacc_all_clauses (cp_parser *parser, omp_clause_mask mask, clauses = cp_parser_omp_clause_if (parser, clauses, here, false); c_name = "if"; break; + case PRAGMA_OMP_CLAUSE_LINK: + clauses = cp_parser_oacc_data_clause (parser, c_kind, clauses); + c_name = "link"; + break; case PRAGMA_OACC_CLAUSE_NUM_GANGS: clauses = cp_parser_omp_clause_num_gangs (parser, clauses); c_name = "num_gangs"; @@ -35853,6 +36147,10 @@ cp_parser_pragma (cp_parser *parser, enum pragma_context context) cp_parser_omp_declare (parser, pragma_tok, context); return false; + case PRAGMA_OACC_DECLARE: + cp_parser_oacc_declare (parser, pragma_tok); + return false; + case PRAGMA_OACC_CACHE: case PRAGMA_OACC_DATA: case PRAGMA_OACC_ENTER_DATA: diff --git a/gcc/cp/pt.c b/gcc/cp/pt.c index 142245a..85ed05d 100644 --- a/gcc/cp/pt.c +++ b/gcc/cp/pt.c @@ -15329,6 +15329,17 @@ tsubst_expr (tree t, tree args, tsubst_flags_t complain, tree in_decl, add_stmt (t); break; + case OACC_DECLARE: + t = copy_node (t); + tmp = tsubst_omp_clauses (OACC_DECLARE_CLAUSES (t), false, false, + args, complain, in_decl); + OACC_DECLARE_CLAUSES (t) = tmp; + tmp = tsubst_omp_clauses (OACC_DECLARE_RETURN_CLAUSES (t), false, false, + args, complain, in_decl); + OACC_DECLARE_RETURN_CLAUSES (t) = tmp; + add_stmt (t); + break; + case OMP_TARGET_UPDATE: case OMP_TARGET_ENTER_DATA: case OMP_TARGET_EXIT_DATA: diff --git a/gcc/fortran/types.def b/gcc/fortran/types.def index ca75654..6d993db 100644 --- a/gcc/fortran/types.def +++ b/gcc/fortran/types.def @@ -145,6 +145,7 @@ DEF_FUNCTION_TYPE_3 (BT_FN_VOID_VPTR_I2_INT, BT_VOID, BT_VOLATILE_PTR, BT_I2, BT DEF_FUNCTION_TYPE_3 (BT_FN_VOID_VPTR_I4_INT, BT_VOID, BT_VOLATILE_PTR, BT_I4, BT_INT) DEF_FUNCTION_TYPE_3 (BT_FN_VOID_VPTR_I8_INT, BT_VOID, BT_VOLATILE_PTR, BT_I8, BT_INT) DEF_FUNCTION_TYPE_3 (BT_FN_VOID_VPTR_I16_INT, BT_VOID, BT_VOLATILE_PTR, BT_I16, BT_INT) +DEF_FUNCTION_TYPE_3 (BT_FN_VOID_PTR_INT_UINT, BT_VOID, BT_PTR, BT_INT, BT_UINT) DEF_FUNCTION_TYPE_4 (BT_FN_VOID_OMPFN_PTR_UINT_UINT, BT_VOID, BT_PTR_FN_VOID_PTR, BT_PTR, BT_UINT, BT_UINT) diff --git a/gcc/gimple.h b/gcc/gimple.h index 02d0db5..2b1d9e7 100644 --- a/gcc/gimple.h +++ b/gcc/gimple.h @@ -170,6 +170,7 @@ enum gf_mask { GF_OMP_TARGET_KIND_OACC_DATA = 7, GF_OMP_TARGET_KIND_OACC_UPDATE = 8, GF_OMP_TARGET_KIND_OACC_ENTER_EXIT_DATA = 9, + GF_OMP_TARGET_KIND_OACC_DECLARE = 10, /* True on an GIMPLE_OMP_RETURN statement if the return does not require a thread synchronization via some sort of barrier. The exact barrier @@ -5989,6 +5990,7 @@ is_gimple_omp_oacc (const gimple *stmt) case GF_OMP_TARGET_KIND_OACC_DATA: case GF_OMP_TARGET_KIND_OACC_UPDATE: case GF_OMP_TARGET_KIND_OACC_ENTER_EXIT_DATA: + case GF_OMP_TARGET_KIND_OACC_DECLARE: return true; default: return false; diff --git a/gcc/gimplify.c b/gcc/gimplify.c index ab9e540..0bf5ce69 100644 --- a/gcc/gimplify.c +++ b/gcc/gimplify.c @@ -163,6 +163,7 @@ struct gimplify_omp_ctx bool target_map_scalars_firstprivate; bool target_map_pointers_as_0len_arrays; bool target_firstprivatize_array_bases; + gomp_target *declare_returns; }; static struct gimplify_ctx *gimplify_ctxp; @@ -382,6 +383,7 @@ new_omp_context (enum omp_region_type region_type) c->default_kind = OMP_CLAUSE_DEFAULT_SHARED; else c->default_kind = OMP_CLAUSE_DEFAULT_UNSPECIFIED; + c->declare_returns = NULL; return c; } @@ -5798,6 +5800,26 @@ omp_notice_threadprivate_variable (struct gimplify_omp_ctx *ctx, tree decl, return false; } +/* Return true if global var DECL is device resident. */ + +static bool +device_resident_p (tree decl) +{ + tree attr = lookup_attribute ("oacc declare", DECL_ATTRIBUTES (decl)); + + if (!attr) + return false; + + for (tree t = TREE_VALUE (attr); t; t = TREE_PURPOSE (t)) + { + tree c = TREE_VALUE (t); + if (OMP_CLAUSE_MAP_KIND (c) == GOMP_MAP_DEVICE_RESIDENT) + return true; + } + + return false; +} + /* Determine outer default flags for DECL mentioned in an OMP region but not declared in an enclosing clause. @@ -5847,6 +5869,8 @@ omp_default_clause (struct gimplify_omp_ctx *ctx, tree decl, flags |= GOVD_FIRSTPRIVATE; break; case OMP_CLAUSE_DEFAULT_UNSPECIFIED: + if (is_global_var (decl) && device_resident_p (decl)) + flags |= GOVD_MAP_TO_ONLY | GOVD_MAP; /* decl will be either GOVD_FIRSTPRIVATE or GOVD_SHARED. */ gcc_assert ((ctx->region_type & ORT_TASK) != 0); if (struct gimplify_omp_ctx *octx = ctx->outer_context) @@ -7529,6 +7553,62 @@ gimplify_oacc_cache (tree *expr_p, gimple_seq *pre_p) *expr_p = NULL_TREE; } +/* Gimplify OACC_DECLARE. */ + +static void +gimplify_oacc_declare (tree *expr_p, gimple_seq *pre_p) +{ + tree expr = *expr_p; + gomp_target *stmt; + tree clauses, t; + + clauses = OACC_DECLARE_CLAUSES (expr); + + gimplify_scan_omp_clauses (&clauses, pre_p, ORT_TARGET_DATA, OACC_DECLARE); + + for (t = clauses; t; t = OMP_CLAUSE_CHAIN (t)) + { + tree attrs, decl = OMP_CLAUSE_DECL (t); + + if (TREE_CODE (decl) == MEM_REF) + continue; + + omp_add_variable (gimplify_omp_ctxp, decl, GOVD_SEEN); + + attrs = lookup_attribute ("oacc declare", DECL_ATTRIBUTES (decl)); + if (attrs) + DECL_ATTRIBUTES (decl) = remove_attribute ("oacc declare", attrs); + } + + stmt = gimple_build_omp_target (NULL, GF_OMP_TARGET_KIND_OACC_DECLARE, + clauses); + + gimplify_seq_add_stmt (pre_p, stmt); + + clauses = OACC_DECLARE_RETURN_CLAUSES (expr); + if (clauses) + { + struct gimplify_omp_ctx *c; + + /* Any clauses that affect the state of a variable prior + to return are saved and dealt with after the body has + been gimplified. */ + + gimplify_scan_omp_clauses (&clauses, pre_p, ORT_TARGET_DATA, + OACC_DECLARE); + + c = gimplify_omp_ctxp; + gimplify_omp_ctxp = c->outer_context; + delete_omp_context (c); + + stmt = gimple_build_omp_target (NULL, GF_OMP_TARGET_KIND_OACC_DECLARE, + clauses); + gimplify_omp_ctxp->declare_returns = stmt; + } + + *expr_p = NULL_TREE; +} + /* Gimplify the contents of an OMP_PARALLEL statement. This involves gimplification of the body, as well as scanning the body for used variables. We need to do this scan now, because variable-sized @@ -9595,11 +9675,15 @@ gimplify_expr (tree *expr_p, gimple_seq *pre_p, gimple_seq *post_p, break; case OACC_HOST_DATA: - case OACC_DECLARE: sorry ("directive not yet implemented"); ret = GS_ALL_DONE; break; + case OACC_DECLARE: + gimplify_oacc_declare (expr_p, pre_p); + ret = GS_ALL_DONE; + break; + case OACC_KERNELS: if (OACC_KERNELS_COMBINED (*expr_p)) sorry ("directive not yet implemented"); @@ -10274,6 +10358,28 @@ gimplify_body (tree fndecl, bool do_parms) gimplify_seq_add_stmt (&seq, outer_stmt); } + if (flag_openacc && gimplify_omp_ctxp) + { + while (gimplify_omp_ctxp) + { + struct gimplify_omp_ctx *c; + + if (gimplify_omp_ctxp->declare_returns) + { + /* Clauses are present that affect the state of a + variable, insert the statment to handle this + as the very last statement. */ + + gimplify_seq_add_stmt (&seq, gimplify_omp_ctxp->declare_returns); + gimplify_omp_ctxp->declare_returns = NULL; + } + + c = gimplify_omp_ctxp; + gimplify_omp_ctxp = c->outer_context; + delete_omp_context (c); + } + } + /* The body must contain exactly one statement, a GIMPLE_BIND. If this is not the case, wrap everything in a GIMPLE_BIND to make it so. */ if (gimple_code (outer_stmt) == GIMPLE_BIND diff --git a/gcc/omp-builtins.def b/gcc/omp-builtins.def index ea9cf0d..4af3640 100644 --- a/gcc/omp-builtins.def +++ b/gcc/omp-builtins.def @@ -317,3 +317,7 @@ DEF_GOMP_BUILTIN (BUILT_IN_GOMP_TARGET_ENTER_EXIT_DATA, BT_FN_VOID_INT_SIZE_PTR_PTR_PTR_UINT_PTR, ATTR_NOTHROW_LIST) DEF_GOMP_BUILTIN (BUILT_IN_GOMP_TEAMS, "GOMP_teams", BT_FN_VOID_UINT_UINT, ATTR_NOTHROW_LIST) +DEF_GOACC_BUILTIN (BUILT_IN_GOACC_STATIC, "GOACC_register_static", + BT_FN_VOID_PTR_INT_UINT, ATTR_NOTHROW_LIST) +DEF_GOACC_BUILTIN (BUILT_IN_GOACC_DECLARE, "GOACC_declare", + BT_FN_VOID_INT_SIZE_PTR_PTR_PTR, ATTR_NOTHROW_LIST) diff --git a/gcc/omp-low.c b/gcc/omp-low.c index ad7c017..2e75c8f 100644 --- a/gcc/omp-low.c +++ b/gcc/omp-low.c @@ -11330,6 +11330,7 @@ expand_omp_target (struct omp_region *region) case GF_OMP_TARGET_KIND_OACC_KERNELS: case GF_OMP_TARGET_KIND_OACC_UPDATE: case GF_OMP_TARGET_KIND_OACC_ENTER_EXIT_DATA: + case GF_OMP_TARGET_KIND_OACC_DECLARE: data_region = false; break; case GF_OMP_TARGET_KIND_DATA: @@ -11563,6 +11564,9 @@ expand_omp_target (struct omp_region *region) case GF_OMP_TARGET_KIND_OACC_ENTER_EXIT_DATA: start_ix = BUILT_IN_GOACC_ENTER_EXIT_DATA; break; + case GF_OMP_TARGET_KIND_OACC_DECLARE: + start_ix = BUILT_IN_GOACC_DECLARE; + break; default: gcc_unreachable (); } @@ -11685,6 +11689,7 @@ expand_omp_target (struct omp_region *region) switch (start_ix) { case BUILT_IN_GOACC_DATA_START: + case BUILT_IN_GOACC_DECLARE: case BUILT_IN_GOMP_TARGET_DATA: break; case BUILT_IN_GOMP_TARGET: @@ -11967,6 +11972,7 @@ build_omp_regions_1 (basic_block bb, struct omp_region *parent, case GF_OMP_TARGET_KIND_EXIT_DATA: case GF_OMP_TARGET_KIND_OACC_UPDATE: case GF_OMP_TARGET_KIND_OACC_ENTER_EXIT_DATA: + case GF_OMP_TARGET_KIND_OACC_DECLARE: /* ..., other than for those stand-alone directives... */ region = NULL; break; @@ -14095,6 +14101,7 @@ lower_omp_target (gimple_stmt_iterator *gsi_p, omp_context *ctx) case GF_OMP_TARGET_KIND_OACC_KERNELS: case GF_OMP_TARGET_KIND_OACC_UPDATE: case GF_OMP_TARGET_KIND_OACC_ENTER_EXIT_DATA: + case GF_OMP_TARGET_KIND_OACC_DECLARE: data_region = false; break; case GF_OMP_TARGET_KIND_DATA: @@ -14169,6 +14176,8 @@ lower_omp_target (gimple_stmt_iterator *gsi_p, omp_context *ctx) case GOMP_MAP_FORCE_TOFROM: case GOMP_MAP_FORCE_PRESENT: case GOMP_MAP_FORCE_DEVICEPTR: + case GOMP_MAP_DEVICE_RESIDENT: + case GOMP_MAP_LINK: gcc_assert (is_gimple_omp_oacc (stmt)); break; default: @@ -15839,6 +15848,7 @@ make_gimple_omp_edges (basic_block bb, struct omp_region **region, case GF_OMP_TARGET_KIND_EXIT_DATA: case GF_OMP_TARGET_KIND_OACC_UPDATE: case GF_OMP_TARGET_KIND_OACC_ENTER_EXIT_DATA: + case GF_OMP_TARGET_KIND_OACC_DECLARE: cur_region = cur_region->outer; break; default: diff --git a/gcc/testsuite/c-c++-common/goacc/declare-1.c b/gcc/testsuite/c-c++-common/goacc/declare-1.c new file mode 100644 index 0000000..b036c63 --- /dev/null +++ b/gcc/testsuite/c-c++-common/goacc/declare-1.c @@ -0,0 +1,83 @@ +/* Test valid uses of declare directive. */ +/* { dg-do compile } */ + +int v0; +#pragma acc declare create(v0) + +int v1; +#pragma acc declare copyin(v1) + +int *v2; +#pragma acc declare deviceptr(v2) + +int v3; +#pragma acc declare device_resident(v3) + +int v4; +#pragma acc declare link(v4) + +int v5, v6, v7, v8; +#pragma acc declare create(v5, v6) copyin(v7, v8) + +void +f (void) +{ + int va0; +#pragma acc declare create(va0) + + int va1; +#pragma acc declare copyin(va1) + + int *va2; +#pragma acc declare deviceptr(va2) + + int va3; +#pragma acc declare device_resident(va3) + + extern int ve0; +#pragma acc declare create(ve0) + + extern int ve1; +#pragma acc declare copyin(ve1) + + extern int *ve2; +#pragma acc declare deviceptr(ve2) + + extern int ve3; +#pragma acc declare device_resident(ve3) + + extern int ve4; +#pragma acc declare link(ve4) + + int va5; +#pragma acc declare copy(va5) + + int va6; +#pragma acc declare copyout(va6) + + int va7; +#pragma acc declare present(va7) + + int va8; +#pragma acc declare present_or_copy(va8) + + int va9; +#pragma acc declare present_or_copyin(va9) + + int va10; +#pragma acc declare present_or_copyout(va10) + + int va11; +#pragma acc declare present_or_create(va11) + + a: + { + int va0; +#pragma acc declare create(va0) + if (v1) + goto a; + else + goto b; + } + b:; +} diff --git a/gcc/testsuite/c-c++-common/goacc/declare-2.c b/gcc/testsuite/c-c++-common/goacc/declare-2.c new file mode 100644 index 0000000..7979f0c --- /dev/null +++ b/gcc/testsuite/c-c++-common/goacc/declare-2.c @@ -0,0 +1,68 @@ +/* Test invalid uses of declare directive. */ +/* { dg-do compile } */ + +#pragma acc declare /* { dg-error "no valid clauses" } */ + +#pragma acc declare create(undeclared) /* { dg-error "undeclared" } */ +/* { dg-error "no valid clauses" "second error" { target *-*-* } 6 } */ + +int v0[10]; +#pragma acc declare create(v0[1:3]) /* { dg-error "subarray" } */ + +int v1; +#pragma acc declare create(v1, v1) /* { dg-error "more than once" } */ + +int v2; +#pragma acc declare create(v2) /* { dg-message "previous directive" } */ +#pragma acc declare copyin(v2) /* { dg-error "more than once" } */ + +int v3; +#pragma acc declare copy(v3) /* { dg-error "at file scope" } */ + +int v4; +#pragma acc declare copyout(v4) /* { dg-error "at file scope" } */ + +int v5; +#pragma acc declare present(v5) /* { dg-error "at file scope" } */ + +int v6; +#pragma acc declare present_or_copy(v6) /* { dg-error "at file scope" } */ + +int v7; +#pragma acc declare present_or_copyin(v7) /* { dg-error "at file scope" } */ + +int v8; +#pragma acc declare present_or_copyout(v8) /* { dg-error "at file scope" } */ + +int v9; +#pragma acc declare present_or_create(v9) /* { dg-error "at file scope" } */ + +void +f (void) +{ + int va0; +#pragma acc declare link(va0) /* { dg-error "global variable" } */ + + extern int ve0; +#pragma acc declare copy(ve0) /* { dg-error "invalid use of" } */ + + extern int ve1; +#pragma acc declare copyout(ve1) /* { dg-error "invalid use of" } */ + + extern int ve2; +#pragma acc declare present(ve2) /* { dg-error "invalid use of" } */ + + extern int ve3; +#pragma acc declare present_or_copy(ve3) /* { dg-error "invalid use of" } */ + + extern int ve4; +#pragma acc declare present_or_copyin(ve4) /* { dg-error "invalid use of" } */ + + extern int ve5; +#pragma acc declare present_or_copyout(ve5) /* { dg-error "invalid use of" } */ + + extern int ve6; +#pragma acc declare present_or_create(ve6) /* { dg-error "invalid use of" } */ + +#pragma acc declare present (v9) /* { dg-error "invalid use of" } */ +} diff --git a/gcc/tree.def b/gcc/tree.def index d0a3bd6..d183c5c 100644 --- a/gcc/tree.def +++ b/gcc/tree.def @@ -1186,8 +1186,9 @@ DEFTREECODE (OMP_TASKGROUP, "omp_taskgroup", tcc_statement, 1) DEFTREECODE (OACC_CACHE, "oacc_cache", tcc_statement, 1) /* OpenACC - #pragma acc declare [clause1 ... clauseN] - Operand 0: OACC_DECLARE_CLAUSES: List of clauses. */ -DEFTREECODE (OACC_DECLARE, "oacc_declare", tcc_statement, 1) + Operand 0: OACC_DECLARE_CLAUSES: List of clauses. + Operand 1: OACC_DECLARE_RETURN_CLAUSES: List of clauses for returns. */ +DEFTREECODE (OACC_DECLARE, "oacc_declare", tcc_statement, 2) /* OpenACC - #pragma acc enter data [clause1 ... clauseN] Operand 0: OACC_ENTER_DATA_CLAUSES: List of clauses. */ diff --git a/gcc/tree.h b/gcc/tree.h index ece083b..f325ea6 100644 --- a/gcc/tree.h +++ b/gcc/tree.h @@ -1243,6 +1243,8 @@ extern void protected_set_expr_location (tree, location_t); #define OACC_DECLARE_CLAUSES(NODE) \ TREE_OPERAND (OACC_DECLARE_CHECK (NODE), 0) +#define OACC_DECLARE_RETURN_CLAUSES(NODE) \ + TREE_OPERAND (OACC_DECLARE_CHECK (NODE), 1) #define OACC_ENTER_DATA_CLAUSES(NODE) \ TREE_OPERAND (OACC_ENTER_DATA_CHECK (NODE), 0) diff --git a/gcc/varpool.c b/gcc/varpool.c index 7d11e20..c83877f 100644 --- a/gcc/varpool.c +++ b/gcc/varpool.c @@ -143,6 +143,42 @@ varpool_node::create_empty (void) return node; } +static void +make_offloadable_1 (varpool_node *node, tree decl ATTRIBUTE_UNUSED) +{ + node->offloadable = 1; +#ifdef ENABLE_OFFLOADING + g->have_offload = true; + if (!in_lto_p) + vec_safe_push (offload_vars, decl); + node->force_output = 1; +#endif +} + +void +make_offloadable (varpool_node *node, tree decl) +{ + tree attrs; + + if (node->offloadable) + return; + + if (flag_openmp) + { + make_offloadable_1 (node, decl); + return; + } + + attrs = lookup_attribute ("oacc declare", DECL_ATTRIBUTES (decl)); + if (attrs) + { + make_offloadable_1 (node, decl); + + DECL_ATTRIBUTES (decl) + = remove_attribute ("oacc declare", DECL_ATTRIBUTES (decl)); + } +} + /* Return varpool node assigned to DECL. Create new one when needed. */ varpool_node * varpool_node::get_create (tree decl) @@ -150,22 +186,18 @@ varpool_node::get_create (tree decl) varpool_node *node = varpool_node::get (decl); gcc_checking_assert (TREE_CODE (decl) == VAR_DECL); if (node) - return node; + { + if (flag_openacc && !DECL_EXTERNAL (decl)) + make_offloadable (node, decl); + return node; + } node = varpool_node::create_empty (); node->decl = decl; if ((flag_openacc || flag_openmp) && !DECL_EXTERNAL (decl) && lookup_attribute ("omp declare target", DECL_ATTRIBUTES (decl))) - { - node->offloadable = 1; -#ifdef ENABLE_OFFLOADING - g->have_offload = true; - if (!in_lto_p) - vec_safe_push (offload_vars, decl); - node->force_output = 1; -#endif - } + make_offloadable (node, decl); node->register_symbol (); return node; diff --git a/include/gomp-constants.h b/include/gomp-constants.h index f834dec..4128912 100644 --- a/include/gomp-constants.h +++ b/include/gomp-constants.h @@ -73,6 +73,11 @@ enum gomp_map_kind POINTER_SIZE_UNITS. */ GOMP_MAP_FORCE_DEVICEPTR = (GOMP_MAP_FLAG_SPECIAL_1 | 0), /* Do not map, copy bits for firstprivate instead. */ + /* OpenACC device_resident. */ + GOMP_MAP_DEVICE_RESIDENT = (GOMP_MAP_FLAG_SPECIAL_1 | 1), + /* OpenACC link. */ + GOMP_MAP_LINK = (GOMP_MAP_FLAG_SPECIAL_1 | 2), + /* Allocate. */ GOMP_MAP_FIRSTPRIVATE = (GOMP_MAP_FLAG_SPECIAL | 0), /* Similarly, but store the value in the pointer rather than pointed by the pointer. */ diff --git a/libgomp/libgomp.map b/libgomp/libgomp.map index 2153661..5f97e32 100644 --- a/libgomp/libgomp.map +++ b/libgomp/libgomp.map @@ -372,12 +372,14 @@ GOACC_2.0 { global: GOACC_data_end; GOACC_data_start; + GOACC_declare; GOACC_enter_exit_data; GOACC_parallel; GOACC_update; GOACC_wait; GOACC_get_thread_num; GOACC_get_num_threads; + GOACC_register_static; }; GOACC_2.0.1 { diff --git a/libgomp/oacc-init.c b/libgomp/oacc-init.c index a0e62a4..6f31e2a 100644 --- a/libgomp/oacc-init.c +++ b/libgomp/oacc-init.c @@ -250,6 +250,8 @@ acc_shutdown_1 (acc_device_t d) /* Get the base device for this device type. */ base_dev = resolve_device (d, true); + goacc_deallocate_static (d); + ndevs = base_dev->get_num_devices_func (); /* Unload all the devices of this type that have been opened. */ @@ -432,7 +434,9 @@ goacc_attach_host_thread_to_device (int ord) void acc_init (acc_device_t d) { - if (!cached_base_dev) + bool init = !cached_base_dev; + + if (init) gomp_init_targets_once (); gomp_mutex_lock (&acc_device_lock); @@ -440,6 +444,9 @@ acc_init (acc_device_t d) cached_base_dev = acc_init_1 (d); gomp_mutex_unlock (&acc_device_lock); + + if (init) + goacc_allocate_static (d); goacc_attach_host_thread_to_device (-1); } diff --git a/libgomp/oacc-int.h b/libgomp/oacc-int.h index f11e216..560f68b 100644 --- a/libgomp/oacc-int.h +++ b/libgomp/oacc-int.h @@ -99,6 +99,9 @@ void goacc_restore_bind (void); void goacc_lazy_initialize (void); void goacc_host_init (void); +void goacc_allocate_static (acc_device_t); +void goacc_deallocate_static (acc_device_t); + #ifdef HAVE_ATTRIBUTE_VISIBILITY # pragma GCC visibility pop #endif diff --git a/libgomp/oacc-parallel.c b/libgomp/oacc-parallel.c index b150106..e374045 100644 --- a/libgomp/oacc-parallel.c +++ b/libgomp/oacc-parallel.c @@ -49,6 +49,68 @@ find_pset (int pos, size_t mapnum, unsigned short *kinds) return kind == GOMP_MAP_TO_PSET; } +static struct oacc_static +{ + void *addr; + size_t size; + unsigned short mask; + bool free; + struct oacc_static *next; +} *oacc_statics; + +static bool alloc_done = false; + +void +goacc_allocate_static (acc_device_t d) +{ + struct oacc_static *s; + + if (alloc_done) + assert (0); + + for (s = oacc_statics; s; s = s->next) + { + void *d; + + switch (s->mask) + { + case GOMP_MAP_FORCE_ALLOC: + break; + + case GOMP_MAP_FORCE_TO: + d = acc_deviceptr (s->addr); + acc_memcpy_to_device (d, s->addr, s->size); + break; + + case GOMP_MAP_FORCE_DEVICEPTR: + case GOMP_MAP_DEVICE_RESIDENT: + case GOMP_MAP_LINK: + break; + + default: + assert (0); + break; + } + } + + alloc_done = true; +} + +void +goacc_deallocate_static (acc_device_t d) +{ + struct oacc_static *s; + unsigned short mask = GOMP_MAP_FORCE_DEALLOC; + + if (!alloc_done) + return; + + for (s = oacc_statics; s; s = s->next) + GOACC_enter_exit_data (d, 1, &s->addr, &s->size, &mask, 0, 0); + + alloc_done = false; +} + static void goacc_wait (int async, int num_waits, va_list *ap); @@ -501,3 +563,77 @@ GOACC_get_thread_num (void) { return 0; } + +void +GOACC_register_static (void *addr, int size, unsigned int mask) +{ + struct oacc_static *s; + + s = (struct oacc_static *) malloc (sizeof (struct oacc_static)); + s->addr = addr; + s->size = (size_t) size; + s->mask = mask; + s->free = false; + s->next = oacc_statics; + + oacc_statics = s; +} + +void +GOACC_declare (int device, size_t mapnum, + void **hostaddrs, size_t *sizes, unsigned short *kinds) +{ + int i; + + for (i = 0; i < mapnum; i++) + { + unsigned char kind = kinds[i] & 0xff; + + if (kind == GOMP_MAP_POINTER || kind == GOMP_MAP_TO_PSET) + continue; + + switch (kind) + { + case GOMP_MAP_FORCE_ALLOC: + case GOMP_MAP_FORCE_DEALLOC: + case GOMP_MAP_FORCE_FROM: + case GOMP_MAP_FORCE_TO: + case GOMP_MAP_POINTER: + GOACC_enter_exit_data (device, 1, &hostaddrs[i], &sizes[i], + &kinds[i], 0, 0); + break; + + case GOMP_MAP_FORCE_DEVICEPTR: + break; + + case GOMP_MAP_ALLOC: + if (!acc_is_present (hostaddrs[i], sizes[i])) + { + GOACC_enter_exit_data (device, 1, &hostaddrs[i], &sizes[i], + &kinds[i], 0, 0); + } + break; + + case GOMP_MAP_TO: + GOACC_enter_exit_data (device, 1, &hostaddrs[i], &sizes[i], + &kinds[i], 0, 0); + + break; + + case GOMP_MAP_FROM: + kinds[i] = GOMP_MAP_FORCE_FROM; + GOACC_enter_exit_data (device, 1, &hostaddrs[i], &sizes[i], + &kinds[i], 0, 0); + break; + + case GOMP_MAP_FORCE_PRESENT: + if (!acc_is_present (hostaddrs[i], sizes[i])) + gomp_fatal ("[%p,%zd] is not mapped", hostaddrs[i], sizes[i]); + break; + + default: + assert (0); + break; + } + } +} diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/declare-1.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/declare-1.c new file mode 100644 index 0000000..8fbec4d --- /dev/null +++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/declare-1.c @@ -0,0 +1,122 @@ +/* { dg-do run { target openacc_nvidia_accel_selected } } */ + +#include <openacc.h> +#include <stdlib.h> +#include <stdio.h> + +#define N 8 + +void +subr2 (int *a) +{ + int i; + int f[N]; +#pragma acc declare copyout (f) + +#pragma acc parallel copy (a[0:N]) + { + for (i = 0; i < N; i++) + { + f[i] = a[i]; + a[i] = f[i] + f[i] + f[i]; + } + } +} + +void +subr1 (int *a) +{ + int f[N]; +#pragma acc declare copy (f) + +#pragma acc parallel copy (a[0:N]) + { + int i; + + for (i = 0; i < N; i++) + { + f[i] = a[i]; + a[i] = f[i] + f[i]; + } + } +} + +int b[8]; +#pragma acc declare create (b) + +int d[8] = { 1, 2, 3, 4, 5, 6, 7, 8 }; +#pragma acc declare copyin (d) + +int +main (int argc, char **argv) +{ + int a[N]; + int e[N]; +#pragma acc declare create (e) + int i; + + for (i = 0; i < N; i++) + a[i] = i + 1; + + if (!acc_is_present (&b, sizeof (b))) + abort (); + + if (!acc_is_present (&d, sizeof (d))) + abort (); + + if (!acc_is_present (&e, sizeof (e))) + abort (); + +#pragma acc parallel copyin (a[0:N]) + { + for (i = 0; i < N; i++) + { + b[i] = a[i]; + a[i] = b[i]; + } + } + + for (i = 0; i < N; i++) + { + if (a[i] != i + 1) + abort (); + } + +#pragma acc parallel copy (a[0:N]) + { + for (i = 0; i < N; i++) + { + e[i] = a[i] + d[i]; + a[i] = e[i]; + } + } + + for (i = 0; i < N; i++) + { + if (a[i] != (i + 1) * 2) + abort (); + } + + for (i = 0; i < N; i++) + { + a[i] = 1234; + } + + subr1 (&a[0]); + + for (i = 0; i < N; i++) + { + if (a[i] != 1234 * 2) + abort (); + } + + subr2 (&a[0]); + + for (i = 0; i < 1; i++) + { + if (a[i] != 1234 * 6) + abort (); + } + + return 0; +} diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/declare-5.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/declare-5.c new file mode 100644 index 0000000..1e2f6ce --- /dev/null +++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/declare-5.c @@ -0,0 +1,13 @@ +/* { dg-do run { target openacc_nvidia_accel_selected } } */ + +int +main (int argc, char **argv) +{ + int a[8] __attribute__((unused)); + + __builtin_printf ("CheCKpOInT\n"); +#pragma acc declare present (a) +} + +/* { dg-output "CheCKpOInT" } */ +/* { dg-shouldfail "" } */