On 12/08/2015 11:55 AM, Thomas Schwinge wrote: Just for clarification, we're implementing the bind clause with the semantics defined in OpenACC 2.5, correct? The 2.0a semantics aren't clear.
> On Sat, 14 Nov 2015 09:36:36 +0100, I wrote: >> Initial support for the OpenACC bind and nohost clauses (routine >> directive) for C, C++. Fortran to follow. Middle end handling and more >> complete testsuite coverage also to follow once we got a few details >> clarified. OK for trunk? > > (Has not yet been reviewed.) Meanwhile, I continued working on the > implementation, focussing on C. See also my question "How to rewrite > call targets (OpenACC bind clause)", > <http://news.gmane.org/find-root.php?message_id=%3C877fkq482i.fsf%40hertz.schwinge.homeip.net%3E>. > > To enable Cesar to help with the C++ and Fortran front ends (thanks!), in > r231423, I just committed "[WIP] OpenACC bind, nohost clauses" to > gomp-4_0-branch. (There has already been initial support, parsing only, > on gomp-4_0-branch.) I'll try to make progress with the generic middle > end bits, but will appreciate any review comments, so before inlining the > complete patch, first a few questions/comments: > > In the OpenACC bind(Y) clause attached to a routine(X) directive, Y can > be an identifier or a string. In the front ends, I canonicalize that > into a string, as we -- at least currently -- don't have any use for the > identifier (or decl?) later on: > > --- gcc/tree-core.h > +++ gcc/tree-core.h > @@ -461,7 +461,7 @@ enum omp_clause_code { > - /* OpenACC clause: bind ( identifer | string ). */ > + /* OpenACC clause: bind (string). */ > OMP_CLAUSE_BIND, So what happens in c++ then? E.g. Say that we have a function sum which is overloaded as follows: int sum (int a, int b) { return a + b; } double sum (double a, double b) { return a + b; } #pragma acc routine (sum) bind (cuda_sum) First of all, does this bind apply to both int sum and double sum, or just the double sum? Second, if the identifier gets canonicalized as a string, will that prevent the name from being mangled, and hence disable function overloading? Also, while I'm asking about c++, is it possible apply bind individually to an overloaded function. E.g. #pragma acc routine (sum) bind (cuda_sum_int) int sum (int a, int b) { return a + b; } #pragma acc routine (sum) bind (cuda_sum_double) double sum (double a, double b) { return a + b; } > All the following are unreachable for OMP_CLAUSE_BIND, OMP_CLAUSE_NOHOST; > document that to make it obvious/expected: > > --- gcc/cp/pt.c > +++ gcc/cp/pt.c > @@ -14501,6 +14501,8 @@ tsubst_omp_clauses (tree clauses, bool > declare_simd, bool allow_fields, > } > } > break; > + case OMP_CLAUSE_BIND: > + case OMP_CLAUSE_NOHOST: > default: > gcc_unreachable (); > } > --- gcc/gimplify.c > +++ gcc/gimplify.c > @@ -7413,6 +7413,8 @@ gimplify_scan_omp_clauses (tree *list_p, gimple_seq > *pre_p, > ctx->default_kind = OMP_CLAUSE_DEFAULT_KIND (c); > break; > > + case OMP_CLAUSE_BIND: > + case OMP_CLAUSE_NOHOST: > default: > gcc_unreachable (); > } > @@ -8104,6 +8106,8 @@ gimplify_adjust_omp_clauses (gimple_seq *pre_p, > gimple_seq body, tree *list_p, > case OMP_CLAUSE_DEVICE_TYPE: > break; > > + case OMP_CLAUSE_BIND: > + case OMP_CLAUSE_NOHOST: > default: > gcc_unreachable (); > } > --- gcc/omp-low.c > +++ gcc/omp-low.c > @@ -2279,6 +2279,8 @@ scan_sharing_clauses (tree clauses, omp_context > *ctx) > sorry ("Clause not supported yet"); > break; > > + case OMP_CLAUSE_BIND: > + case OMP_CLAUSE_NOHOST: > default: > gcc_unreachable (); > } > @@ -2453,6 +2455,8 @@ scan_sharing_clauses (tree clauses, omp_context > *ctx) > sorry ("Clause not supported yet"); > break; > > + case OMP_CLAUSE_BIND: > + case OMP_CLAUSE_NOHOST: > default: > gcc_unreachable (); > } > --- gcc/tree-nested.c > +++ gcc/tree-nested.c > @@ -1200,6 +1200,8 @@ convert_nonlocal_omp_clauses (tree *pclauses, > struct walk_stmt_info *wi) > case OMP_CLAUSE_SEQ: > break; > > + case OMP_CLAUSE_BIND: > + case OMP_CLAUSE_NOHOST: > default: > gcc_unreachable (); > } > @@ -1882,6 +1884,8 @@ convert_local_omp_clauses (tree *pclauses, struct > walk_stmt_info *wi) > case OMP_CLAUSE_SEQ: > break; > > + case OMP_CLAUSE_BIND: > + case OMP_CLAUSE_NOHOST: > default: > gcc_unreachable (); > } Those changes look reasonable. > C front end: > > --- gcc/c/c-parser.c > +++ gcc/c/c-parser.c > @@ -11607,6 +11607,8 @@ c_parser_oacc_clause_async (c_parser *parser, > tree list) > static tree > c_parser_oacc_clause_bind (c_parser *parser, tree list) > { > + check_no_duplicate_clause (list, OMP_CLAUSE_BIND, "bind"); > + > location_t loc = c_parser_peek_token (parser)->location; > > parser->lex_untranslated_string = true; > @@ -11615,20 +11617,43 @@ c_parser_oacc_clause_bind (c_parser *parser, > tree list) > parser->lex_untranslated_string = false; > return list; > } > - if (c_parser_next_token_is (parser, CPP_NAME) > - || c_parser_next_token_is (parser, CPP_STRING)) > + tree name = error_mark_node; > + c_token *token = c_parser_peek_token (parser); > + if (c_parser_next_token_is (parser, CPP_NAME)) > { > - tree t = c_parser_peek_token (parser)->value; > + tree decl = lookup_name (token->value); > + if (!decl) > + error_at (token->location, "%qE has not been declared", > + token->value); > + else if (TREE_CODE (decl) != FUNCTION_DECL) > + error_at (token->location, "%qE does not refer to a function", > + token->value); > > Quite possibly we'll want to add more error checking (matching signature > of X and Y, for example). Good idea, but I wonder if that would be too strict. Should we allow integer promotion in the bind function arguments? > + else > + { > + //TODO? TREE_USED (decl) = 1; > + tree name_id = DECL_NAME (decl); > + name = build_string (IDENTIFIER_LENGTH (name_id), > + IDENTIFIER_POINTER (name_id)); > + } > + c_parser_consume_token (parser); > + } > > Should I set TREE_USED after having looked up the identifier? > > + else if (c_parser_next_token_is (parser, CPP_STRING)) > + { > + name = token->value; > c_parser_consume_token (parser); > - tree c = build_omp_clause (loc, OMP_CLAUSE_BIND); > - OMP_CLAUSE_BIND_NAME (c) = t; > - OMP_CLAUSE_CHAIN (c) = list; > - list = c; > } > else > - c_parser_error (parser, "expected identifier or character string > literal"); > + c_parser_error (parser, > + "expected identifier or character string literal"); > parser->lex_untranslated_string = false; > c_parser_require (parser, CPP_CLOSE_PAREN, "expected %<)%>"); > + if (name != error_mark_node) > + { > + tree c = build_omp_clause (loc, OMP_CLAUSE_BIND); > + OMP_CLAUSE_BIND_NAME (c) = name; > + OMP_CLAUSE_CHAIN (c) = list; > + list = c; > + } > return list; > } > > @@ -13977,10 +14002,10 @@ static void > c_parser_oacc_routine (c_parser *parser, enum pragma_context context) > { > tree decl = NULL_TREE; > - /* Create a dummy claue, to record location. */ > + /* Create a dummy clause, to record the location. */ > tree c_head = build_omp_clause (c_parser_peek_token (parser)->location, > - OMP_CLAUSE_SEQ); > - > + OMP_CLAUSE_ERROR); > > I don't know why somebody chose OMP_CLAUSE_SEQ for this; changed to a > distinctive OMP_CLAUSE_ERROR. In the following, handling of c_head and > generally the clauses seemed unnecessarily complicated to me, so I > simplified that as follows: I think that was me. As the comment states, I was using a dummy clause to save the location for error reporting. OMP_CLAUSE_SEQ was chosen because it's default level of parallelism for routines. Your changes are ok though. > @@ -14018,9 +14043,9 @@ c_parser_oacc_routine (c_parser *parser, enum > pragma_context context) > tree clauses = c_parser_oacc_all_clauses > (parser, OACC_ROUTINE_CLAUSE_MASK, "#pragma acc routine", > OACC_ROUTINE_CLAUSE_DEVICE_TYPE_MASK); > - > - /* Force clauses to be non-null, by attaching context to it. */ > - clauses = tree_cons (c_head, clauses, NULL_TREE); > + /* Prepend the dummy clause. */ > + OMP_CLAUSE_CHAIN (c_head) = clauses; > + clauses = c_head; > > if (decl) > c_finish_oacc_routine (parser, decl, clauses, true, true, false); > @@ -14040,7 +14065,9 @@ static void > c_finish_oacc_routine (c_parser *ARG_UNUSED (parser), tree fndecl, > tree clauses, bool named, bool first, bool is_defn) > { > - location_t loc = OMP_CLAUSE_LOCATION (TREE_PURPOSE (clauses)); > + location_t loc = OMP_CLAUSE_LOCATION (clauses); > + /* Get rid of the dummy clause. */ > + clauses = OMP_CLAUSE_CHAIN (clauses); > > if (!fndecl || TREE_CODE (fndecl) != FUNCTION_DECL || !first) > { > @@ -14059,13 +14086,12 @@ c_finish_oacc_routine (c_parser *ARG_UNUSED > (parser), tree fndecl, > TREE_USED (fndecl) ? "use" : "definition"); > > /* Process for function attrib */ > - tree dims = build_oacc_routine_dims (TREE_VALUE (clauses)); > + tree dims = build_oacc_routine_dims (clauses); > replace_oacc_fn_attrib (fndecl, dims); > > - /* Also attach as a declare. */ > - DECL_ATTRIBUTES (fndecl) > - = tree_cons (get_identifier ("omp declare target"), > - clauses, DECL_ATTRIBUTES (fndecl)); > + /* Also add an "omp declare target" attribute, with clauses. */ > + DECL_ATTRIBUTES (fndecl) = tree_cons (get_identifier ("omp declare > target"), > + clauses, DECL_ATTRIBUTES > (fndecl)); > } > > I don't know why somebody chose to attach the clauses to the "omp declare > target" attribute in this way? Especially given that so far there hasn't > been any user of this information (I'm now adding such users). Is that > OK, or should we have a separate "omp clauses" attribute or similar? That was probably me again. When I started working on routine, I didn't think it was going to be necessary to have a separate attribute for acc routines. Then I added an acc routine attribute for something (forgot what exactly), but these routine clauses were never updated. I like the idea of having an "omp clauses" attribute. Especially since we're going to need to eventually chain a list of device_type clauses together. It's probably easier to access the clauses by pulling them from the "omp clauses" attribute. > Again simplifying the c_head/clauses handling (snipped), the C++ front > end changes are very similar to the C front end changes: > > --- gcc/cp/parser.c > +++ gcc/cp/parser.c > @@ -31539,42 +31538,76 @@ static tree > cp_parser_oacc_clause_bind (cp_parser *parser, tree list) > { > [...] > - if (cp_lexer_next_token_is (parser->lexer, CPP_NAME) > - || cp_lexer_next_token_is (parser->lexer, CPP_STRING)) > + tree name = error_mark_node; > + cp_token *token = cp_lexer_peek_token (parser->lexer); > + if (cp_lexer_next_token_is (parser->lexer, CPP_NAME)) > > I'm not particularly confident in the following lookup/error checking > (which I copied a lot from C++ OpenACC routine parsing): > > { > - tree t; > - > - if (cp_lexer_peek_token (parser->lexer)->type == CPP_STRING) > - { > - t = cp_lexer_peek_token (parser->lexer)->u.value; > - cp_lexer_consume_token (parser->lexer); > + //TODO > + tree id = cp_parser_id_expression (parser, /*template_p=*/false, > + /*check_dependency_p=*/true, > + /*template_p=*/NULL, > + /*declarator_p=*/false, > + /*optional_p=*/false); > + tree decl = cp_parser_lookup_name_simple (parser, id, > token->location); > + if (id != error_mark_node && decl == error_mark_node) > + cp_parser_name_lookup_error (parser, id, decl, NLE_NULL, > + token->location); > + if (/* TODO */ !decl || decl == error_mark_node) > + error_at (token->location, "%qE has not been declared", > + token->u.value); > + else if (/* TODO */ is_overloaded_fn (decl) > + && (TREE_CODE (decl) != FUNCTION_DECL > + || DECL_FUNCTION_TEMPLATE_P (decl))) > + error_at (token->location, "%qE names a set of overloads", > + token->u.value); > + else if (/* TODO */ !DECL_NAMESPACE_SCOPE_P (decl)) > + { > + /* Perhaps we should use the same rule as declarations in > different > + namespaces? */ > + error_at (token->location, > + "%qE does not refer to a namespace scope function", > + token->u.value); > } > + else if (TREE_CODE (decl) != FUNCTION_DECL) > + error_at (token->location, > + "%qE does not refer to a function", > + token->u.value); > > ... also we'll want to add a lot more testsuite coverage for this. (Also > for the OpenACC routine directive itself.) I'll look into this. > else > - t = cp_parser_id_expression (parser, /*template_p=*/false, > - /*check_dependency_p=*/true, > - /*template_p=*/NULL, > - /*declarator_p=*/false, > - /*optional_p=*/false); > - if (t == error_mark_node) > - return t; > - > - tree c = build_omp_clause (loc, OMP_CLAUSE_BIND); > - OMP_CLAUSE_BIND_NAME (c) = t; > - OMP_CLAUSE_CHAIN (c) = list; > - list = c; > + { > + //TODO? TREE_USED (decl) = 1; > + tree name_id = DECL_NAME (decl); > + name = build_string (IDENTIFIER_LENGTH (name_id), > + IDENTIFIER_POINTER (name_id)); > > We probably need to apply C++ name mangling here? How to do that? > > + } > + //cp_lexer_consume_token (parser->lexer); > + } > + else if (cp_lexer_next_token_is (parser->lexer, CPP_STRING)) > + { > + name = token->u.value; > + cp_lexer_consume_token (parser->lexer); > } > else > - cp_parser_error (parser, "expected identifier or character string > literal"); > + cp_parser_error (parser, > + "expected identifier or character string literal"); > parser->translate_strings_p = save_translate_strings_p; > cp_parser_require (parser, CPP_CLOSE_PAREN, RT_CLOSE_PAREN); > + if (name != error_mark_node) > + { > + tree c = build_omp_clause (loc, OMP_CLAUSE_BIND); > + OMP_CLAUSE_BIND_NAME (c) = name; > + OMP_CLAUSE_CHAIN (c) = list; > + list = c; > + } > return list; > } > > What I changed in the Fortran front end is just a quick hack. Also I > have not spent any effort on updating the existing OpenACC bind clause > support: the name is (only) parsed into routine_bind, but then not > handled any further? Also needs testsuite coverage, obviously. > > --- gcc/fortran/gfortran.h > +++ gcc/fortran/gfortran.h > @@ -850,6 +850,7 @@ typedef struct > > /* This is an OpenACC acclerator function at level N - 1 */ > unsigned oacc_function:3; > + unsigned oacc_function_nohost:1; > > /* Attributes set by compiler extensions (!GCC$ ATTRIBUTES). */ > unsigned ext_attr:EXT_ATTR_NUM; > --- gcc/fortran/openmp.c > +++ gcc/fortran/openmp.c > @@ -1884,6 +1884,8 @@ gfc_match_oacc_routine (void) > goto cleanup; > gfc_current_ns->proc_name->attr.oacc_function > = gfc_oacc_routine_dims (c) + 1; > + gfc_current_ns->proc_name->attr.oacc_function_nohost > + = c ? c->nohost : false; > } > > if (n) > --- gcc/fortran/trans-decl.c > +++ gcc/fortran/trans-decl.c > @@ -1309,8 +1309,13 @@ add_attributes_to_decl (symbol_attribute sym_attr, > tree list) > || sym_attr.oacc_declare_device_resident > #endif > ) > - list = tree_cons (get_identifier ("omp declare target"), > - NULL_TREE, list); > + { > + tree c = NULL_TREE; > + if (sym_attr.oacc_function_nohost) > + c = build_omp_clause (/* TODO */ input_location, > + OMP_CLAUSE_NOHOST); > + list = tree_cons (get_identifier ("omp declare target"), c, list); > + } > #if 0 /* TODO */ > if (sym_attr.oacc_declare_link) > list = tree_cons (get_identifier ("omp declare target link"), > > I guess add_attributes_to_decl is the correct place to be doning this? > > --- gcc/fortran/trans-openmp.c > +++ gcc/fortran/trans-openmp.c > @@ -2644,6 +2644,13 @@ gfc_trans_omp_clauses_1 (stmtblock_t *block, > gfc_omp_clauses *clauses, > OMP_CLAUSE_GANG_STATIC_EXPR (c) = arg; > } > } > + if (clauses->nohost) > + { > + c = build_omp_clause (where.lb->location, OMP_CLAUSE_NOHOST); > + omp_clauses = gfc_trans_add_clause (c, omp_clauses); > + //TODO > + gcc_unreachable(); > + } > > Probably we can generally just put a gcc_unreachable call here, with a > source code comment added. Again, this is to make sure that the reader > of that code doesn't wonder why "clauses->nohost" has been forgotten to > be handled here. > > return nreverse (omp_clauses); > } That'll go on my todo list too. > Middle end. In the LTO wrapper, at the end of read_cgraph_and_symbols, > for ACCEL_COMPILERs handle OpenACC bind clauses: > > --- gcc/lto/lto.c > +++ gcc/lto/lto.c > @@ -2942,6 +2944,36 @@ read_cgraph_and_symbols (unsigned nfiles, const > char **fnames) > > ggc_free (all_file_decl_data); > all_file_decl_data = NULL; > + > +#ifdef ACCEL_COMPILER > + /* In an offload compiler, redirect calls to any function X that is > tagged > + with an OpenACC bind(Y) clause to call Y instead of X. */ > + FOR_EACH_SYMBOL (snode) > + { > + tree decl = snode->decl; > + tree attr = lookup_attribute ("omp declare target", > + DECL_ATTRIBUTES (decl)); > + if (attr) > + { > + tree clauses = TREE_VALUE (attr); > + /* TODO: device_type handling. */ > + tree clause_bind = find_omp_clause (clauses, OMP_CLAUSE_BIND); > + if (clause_bind) > + { > + tree clause_bind_name = OMP_CLAUSE_BIND_NAME (clause_bind); > + const char *bind_name = TREE_STRING_POINTER(clause_bind_name); > + if (symtab->dump_file) > + fprintf (symtab->dump_file, > + "Applying \"bind(%s)\" clause to declaration of " > + "function \"%s\".\n", > + bind_name, IDENTIFIER_POINTER (DECL_NAME (decl))); > + //TODO: Use gcc/varasm.c:set_user_assembler_name instead? > + symtab->change_decl_assembler_name (decl, > + get_identifier > (bind_name)); > + } > + } > + } > +#endif /* ACCEL_COMPILER */ > } > > Probably that should be put into a separate function (in gcc/omp-low.c, > even?). Is the end of read_cgraph_and_symbols the correct place to > put/call this? Per my "How to rewrite call targets (OpenACC bind > clause)" email, > <http://news.gmane.org/find-root.php?message_id=%3C877fkq482i.fsf%40hertz.schwinge.homeip.net%3E>, > it's still not clear to me whether just setting the decl's assembler name > here is the right (and sufficient) thing to do (but it seems to work, > with -fno-inline at least...). I don't think the placement matters too much. It's a minor detail that can be changed later. > Joseph once pointed out that we'll need to add user_label_prefix to the > bind_name -- but only if an indentifier has been used for Y in the > bind(Y) clause, and not when a string has been used? > > Then, the following handling in execute_oacc_device_lower (correct > position in the pipeline -- as early as possible after the LTO front end, > I guess?), for ACCEL_COMPILERs handle OpenACC bind clauses, and for > non-ACCEL_COMPILERs handle OpenACC nohost clauses. In both cases, use > the new TODO_discard_function, > <http://news.gmane.org/find-root.php?message_id=%3C563A3791.7020001%40suse.cz%3E>, > that has recently been added. :-) > > --- gcc/omp-low.c > +++ gcc/omp-low.c > @@ -19853,14 +19857,76 @@ default_goacc_reduction (gcall *call) > static unsigned int > execute_oacc_device_lower () > { > - tree attrs = get_oacc_fn_attrib (current_function_decl); > - int dims[GOMP_DIM_MAX]; > - > - if (!attrs) > + /* There are offloaded functions without an "omp declare target" > attribute, > + so we'll not handle these here, but on the other hand, OpenACC bind > and > + nohost clauses can only be generated in the front ends, and an "omp > + declare target" attribute will then also always have been set > there, so > + this is not a problem in practice. */ > + tree attr = lookup_attribute ("omp declare target", > + DECL_ATTRIBUTES (current_function_decl)); > + > +#if defined(ACCEL_COMPILER) > + /* In an offload compiler, discard any offloaded function X that is > tagged > + with an OpenACC bind(Y) clause: all references to X have been > rewritten to > + refer to Y; X is unreachable, do not compile it. */ > + if (attr) > + { > + tree clauses = TREE_VALUE (attr); > + /* TODO: device_type handling. */ > + tree clause_bind = find_omp_clause (clauses, OMP_CLAUSE_BIND); > + if (clause_bind) > + { > + tree clause_bind_name = OMP_CLAUSE_BIND_NAME (clause_bind); > + const char *bind_name = TREE_STRING_POINTER(clause_bind_name); > + if (dump_file) > + fprintf (dump_file, > + "Discarding function \"%s\" with \"bind(%s)\" > clause.\n", > + IDENTIFIER_POINTER (DECL_NAME > (current_function_decl)), > + bind_name); > + TREE_ASM_WRITTEN (current_function_decl) = 1; > + return TODO_discard_function; > + } > + } > +#endif /* ACCEL_COMPILER */ > +#if !defined(ACCEL_COMPILER) > + /* In the host compiler, discard any offloaded function that is tagged > with > + an OpenACC nohost clause. */ > + if (attr) > + { > + tree clauses = TREE_VALUE (attr); > + if (find_omp_clause (clauses, OMP_CLAUSE_NOHOST)) > + { > + /* There are no construct/clause combinations that could make > this > + happen, but play it safe, and verify that we never discard a > + function that is stored in offload_funcs, used for > target/offload > + function mapping. */ > + if (flag_checking) > + { > + bool found = false; > + for (unsigned i = 0; > + !found && i < vec_safe_length (offload_funcs); > + i++) > + if ((*offload_funcs)[i] == current_function_decl) > + found = true; > + gcc_assert (!found); > + } > + > + if (dump_file) > + fprintf (dump_file, > + "Discarding function \"%s\" with \"nohost\" > clause.\n", > + IDENTIFIER_POINTER (DECL_NAME > (current_function_decl))); > + TREE_ASM_WRITTEN (current_function_decl) = 1; > + return TODO_discard_function; I don't think this is a good idea. If you have a nohost function, wounldn't that prevent the code from linking? Perhaps nohost should kind of implement a reverse bind on the host. E.g. discard the function defintion and replace it with an asm alias to some libgomp function like goacc_nohost_fallback. That way, the program will still link and the runtime will provide the end user with a sensible error when things go wrong. > + } > + } > +#endif /* !ACCEL_COMPILER */ > + > + attr = get_oacc_fn_attrib (current_function_decl); > + if (!attr) > /* Not an offloaded function. */ > return 0; > - > - int fn_level = oacc_validate_dims (current_function_decl, attrs, dims); > + int dims[GOMP_DIM_MAX]; > + int fn_level = oacc_validate_dims (current_function_decl, attr, dims); > > /* Discover, partition and process the loops. */ > oacc_loop *loops = oacc_loop_discovery (); > > Initial testsuite updates: > > --- gcc/testsuite/c-c++-common/goacc/routine-2.c > +++ gcc/testsuite/c-c++-common/goacc/routine-2.c > @@ -1,21 +1,40 @@ > +/* Test invalid use of clauses with routine. */ > [...] > +extern void a(void), b(void); > + > +#pragma acc routine bind(a) bind(b) /* { dg-error "too many .bind. > clauses" } */ > +extern void bind_1 (void); > > This diagnostic does make sense (can't bind to a and b at the same time), > but this will need re-visiting for device_type clause support. > > +#pragma acc routine nohost nohost /* { dg-error "too many .nohost. > clauses" } */ > +extern void nohost (void); > > But I'm not too sure about this one. After all, there is no harm in > specifying multiple such clauses. However, GCC generally (also for > "simple" OpenMP clauses?) seems to diagnose such usage, so it's probably > a good idea to be consistent? I think so. If the user wants to duplicate nohost, then nohost should go into a device_type. > --- /dev/null > +++ libgomp/testsuite/libgomp.oacc-c-c++-common/routine-bind-nohost-1.c > @@ -0,0 +1,105 @@ > +/* Test the bind and nohost clauses for OpenACC routine directive. */ > + > +/* TODO. Function inlining and the OpenACC bind clause do not yet get > on well > + with one another. > + { dg-additional-options "-fno-inline" } */ > > TODO. > > +/* TODO. C works, but for C++ we get: "lto1: internal compiler error: in > + ipa_propagate_frequency". > + { dg-xfail-if "TODO" { *-*-* } } */ > > TODO. Perhaps related to missing C++ name mangling (see above), perhaps > something else. > > +#include <openacc.h> > + > +/* "MINUS_TWO" is the device variant for function "TWO". Similar for > "THREE", > + and "FOUR". Exercising different variants for declaring routines. */ > + > +#pragma acc routine nohost > +extern int MINUS_TWO(void); > + > +int MINUS_TWO(void) > +{ > + if (!acc_on_device(acc_device_not_host)) > + __builtin_abort(); > + return -2; > +} > + > +extern int TWO(void); > +#pragma acc routine (TWO) bind(MINUS_TWO) > + > +int TWO(void) > +{ > + if (acc_on_device(acc_device_not_host)) > + __builtin_abort(); > + return 2; > +} > + > + > +#pragma acc routine nohost > +int MINUS_THREE(void) > +{ > + if (!acc_on_device(acc_device_not_host)) > + __builtin_abort(); > + return -3; > +} > + > +#pragma acc routine bind(MINUS_THREE) > +extern int THREE(void); > + > +int THREE(void) > +{ > + if (acc_on_device(acc_device_not_host)) > + __builtin_abort(); > + return 3; > +} > + > + > +/* Due to using a string in the bind clause, we don't need "MINUS_FOUR" > in > + scope here. */ > +#pragma acc routine bind("MINUS_FOUR") > +int FOUR(void) > +{ > + if (acc_on_device(acc_device_not_host)) > + __builtin_abort(); > + return 4; > +} > + > +extern int MINUS_FOUR(void); > +#pragma acc routine (MINUS_FOUR) nohost > + > +int MINUS_FOUR(void) > +{ > + if (!acc_on_device(acc_device_not_host)) > + __builtin_abort(); > + return -4; > +} > + > + > +int main() > +{ > + int x2, x3, x4; > + > +#pragma acc parallel copyout(x2, x3, x4) if(0) > + { > + x2 = TWO(); > + x3 = THREE(); > + x4 = FOUR(); > + } > + if (x2 != 2 || x3 != 3 || x4 != 4) > + __builtin_abort(); > + > +#pragma acc parallel copyout(x2, x3, x4) > + { > + x2 = TWO(); > + x3 = THREE(); > + x4 = FOUR(); > + } > +#ifdef ACC_DEVICE_TYPE_host > + if (x2 != 2 || x3 != 3 || x4 != 4) > + __builtin_abort(); > +#else > + if (x2 != -2 || x3 != -3 || x4 != -4) > + __builtin_abort(); > +#endif > + > + return 0; > +} > > I'd also like to add test cases where the host and device function > definitions are in separate files, so I'll try to figure out how to do > that in the libgomp testsuite. I thought we're using lto, so being in separate files doens't really matter in the end. > --- /dev/null > +++ libgomp/testsuite/libgomp.oacc-c-c++-common/routine-nohost-1.c > @@ -0,0 +1,18 @@ > +/* { dg-do link } */ > + > +extern int three (void); > + > +#pragma acc routine (three) nohost > +__attribute__((noinline)) > +int three(void) > +{ > + return 3; > +} > + > +int main(void) > +{ > + return (three() == 3) ? 0 : 1; > +} > + > +/* Expecting link to fail; "undefined reference to `three'" (or similar). > + { dg-excess-errors "" } */ > > This results in an XFAIL, which is not nice. Is there a mechanism in the > GCC testsuite/DejaGnu to check for an expected link failure (due to a > missing symbol)? I guess we could cook up something that instead > triggers a link failure for a duplicate or incompatible symbol > definition? This is an interesting test case. So what's supposed to happen if a nohost routine is called outside of an acc context? Should it still work or not? As mentioned above, I don't think there should be a missing symbol error. Maybe check for a "LIBGOMP: invalid call to nohost function". > --- libgomp/testsuite/libgomp.oacc-fortran/routine-6.f90 > +++ libgomp/testsuite/libgomp.oacc-fortran/routine-6.f90 > @@ -1,5 +1,5 @@ > ! { dg-do run } > -! { dg-xfail-if "not found" { openacc_host_selected } } > +! { dg-xfail-if "TODO" { *-*-* } } > > TODO. ICE, if I remember correctly. Cesar