On 29/10/2020 10:03 am, Jakub Jelinek wrote:
I'm actually not sure how this can work correctly.
Let's say we have
int foo () { return 1; }
int bar () { return 2; }
int baz () { return 3; }
int qux () { return 4; }
int a = foo ();
int b = bar ();
int c = baz ();
int *d = &c;
int e = qux ();
int f = e + 1;
int *g = &f;
#pragma omp declare target to (b, d, g)
So, for the implicit declare target discovery, a is not declare target to,
nor is foo, and everything else is; b, d, g explicitly, c because it is
referenced in initializer of b, f because it is mentioned in initializer of
g and e because it is mentioned in initializer of f.
Haven't checked if the new function you've added is called before or after
analyze_function calls omp_discover_implicit_declare_target, but I don't
really see how it can work when it is not inside of that function, so that
discovery of new static vars that are implicitly declare target to doesn't
result in marking of its dynamic initializers too. Perhaps we need a
langhook for that. But if it is a separate function, either it is called
before the other discovery and will ignore static initializers for vars
that will only be marked as implicit declare target to later, or it is done
afterwards, but then it would really need to duplicate everything what the
other function does, otherwise it woiuldn't discover everything.
I have added a new langhook GET_DECL_INIT that by default returns the
DECL_INITIAL of a variable declaration, but for C++ can also return the dynamic
initializer if present. omp_discover_implicit_declare_target and
omp_discover_declare_target_var_r have been changed to use the new langhook
instead of using DECL_INITIAL.
The dynamic initializer information is stored in a new variable
dynamic_initializers. The information is originally stored in static_aggregates,
but this is nulled by calling prune_vars_needing_no_initialization in
c_parse_final_cleanups. I copy the information into a separate variable before
it is discarded - this avoids any potential problems that may be caused by
trying to change the way that static_aggregates currently works.
With this, all the functions and variables in your example are marked correctly:
foo ()
...
__attribute__((omp declare target))
bar ()
...
__attribute__((omp declare target))
baz ()
...
__attribute__((omp declare target))
qux ()
...
.offload_var_table:
.quad g
.quad 8
.quad d
.quad 8
.quad b
.quad 4
.quad c
.quad 4
.quad f
.quad 4
.quad e
.quad 4
Your example is now a compile test in g++.dg/gomp/.
Anyway, that is one thing, the other is even if the implicit declare target
discovery handles those correctly, the question is what should we do
afterwards. Because the C++ FE normally creates a single function that
performs the dynamic initialization of the TUs variables. But that function
shouldn't be really declare target to, it initializes not only (explicit or
implicit) declare target to variables, but also host only variables.
So we'll probably need to create next to that host only TU constructor
also a device only constructor function that will only initialize the
declare target to variables.
Even without this patch, G++ currently accepts something like
int foo() { return 1; }
int x = foo();
#pragma omp declare target to(x)
but will not generate the device-side initializer for x, even though x is now
present on the device. So this part of the implementation is broken with or
without the patch.
Given that my patch doesn't make the current situation any worse, can I commit
this portion of it to trunk for now, and leave device-side dynamic
initialization for later?
Bootstrapped on x86_64 with no offloading, G++ testsuite ran with no
regressions, and no regressions in the libgomp testsuite with Nvidia offloading.
Thanks,
Kwok
From 0348b149474d0922d79209705e6777e7af271e0d Mon Sep 17 00:00:00 2001
From: Kwok Cheung Yeung <k...@codesourcery.com>
Date: Wed, 18 Nov 2020 13:54:01 -0800
Subject: [PATCH] openmp: Implicitly add 'declare target' directives for
dynamic initializers in C++
2020-11-18 Kwok Cheung Yeung <k...@codesourcery.com>
gcc/
* langhooks-def.h (lhd_get_decl_init): New.
(LANG_HOOKS_GET_DECL_INIT): New.
(LANG_HOOKS_DECLS): Add LANG_HOOKS_GET_DECL_INIT.
* langhooks.h (struct lang_hooks_for_decls): Add get_decl_init.
* omp-offload.c (omp_discover_declare_target_var_r): Use
get_decl_init langhook in place of DECL_INITIAL.
gcc/cp/
* cp-lang.c (cxx_get_decl_init): New.
(LANG_HOOKS_GET_DECL_INIT): New.
* cp-tree.h (dynamic_initializers): New.
* decl.c (dynamic_initializers): New.
* decl2.c (c_parse_final_cleanups): Copy vars into
dynamic_initializers.
gcc/testsuite/
* g++.dg/gomp/declare-target-3.C: New.
---
gcc/cp/cp-lang.c | 24 +++++++++++++++++++++
gcc/cp/cp-tree.h | 2 ++
gcc/cp/decl.c | 6 ++++++
gcc/cp/decl2.c | 5 +++++
gcc/langhooks-def.h | 5 ++++-
gcc/langhooks.c | 8 +++++++
gcc/langhooks.h | 5 +++++
gcc/omp-offload.c | 9 ++++----
gcc/testsuite/g++.dg/gomp/declare-target-3.C | 31 ++++++++++++++++++++++++++++
9 files changed, 90 insertions(+), 5 deletions(-)
create mode 100644 gcc/testsuite/g++.dg/gomp/declare-target-3.C
diff --git a/gcc/cp/cp-lang.c b/gcc/cp/cp-lang.c
index 9e980bc..44bf847 100644
--- a/gcc/cp/cp-lang.c
+++ b/gcc/cp/cp-lang.c
@@ -34,6 +34,7 @@ static tree cp_eh_personality (void);
static tree get_template_innermost_arguments_folded (const_tree);
static tree get_template_argument_pack_elems_folded (const_tree);
static tree cxx_enum_underlying_base_type (const_tree);
+static tree* cxx_get_decl_init (tree);
/* Lang hooks common to C++ and ObjC++ are declared in cp/cp-objcp-common.h;
consequently, there should be very few hooks below. */
@@ -86,6 +87,9 @@ static tree cxx_enum_underlying_base_type (const_tree);
#undef LANG_HOOKS_GET_SUBSTRING_LOCATION
#define LANG_HOOKS_GET_SUBSTRING_LOCATION c_get_substring_location
+#undef LANG_HOOKS_GET_DECL_INIT
+#define LANG_HOOKS_GET_DECL_INIT cxx_get_decl_init
+
/* Each front end provides its own lang hook initializer. */
struct lang_hooks lang_hooks = LANG_HOOKS_INITIALIZER;
@@ -227,6 +231,26 @@ tree cxx_enum_underlying_base_type (const_tree type)
return underlying_type;
}
+/* The C++ version of the get_decl_init langhook returns the static
+ initializer for a variable declaration if present, otherwise it
+ tries to find and return the dynamic initializer. If not present,
+ it returns NULL. */
+
+static tree*
+cxx_get_decl_init (tree decl)
+{
+ tree node;
+
+ if (DECL_INITIAL (decl))
+ return &DECL_INITIAL (decl);
+
+ for (node = dynamic_initializers; node; node = TREE_CHAIN (node))
+ if (TREE_VALUE (node) == decl)
+ return &TREE_PURPOSE (node);
+
+ return NULL;
+}
+
#if CHECKING_P
namespace selftest {
diff --git a/gcc/cp/cp-tree.h b/gcc/cp/cp-tree.h
index 81485de..53277e1 100644
--- a/gcc/cp/cp-tree.h
+++ b/gcc/cp/cp-tree.h
@@ -5509,6 +5509,8 @@ extern GTY(()) tree static_aggregates;
/* Likewise, for thread local storage. */
extern GTY(()) tree tls_aggregates;
+extern GTY(()) tree dynamic_initializers;
+
enum overload_flags { NO_SPECIAL = 0, DTOR_FLAG, TYPENAME_FLAG };
/* These are uses as bits in flags passed to various functions to
diff --git a/gcc/cp/decl.c b/gcc/cp/decl.c
index d90e984..44558da 100644
--- a/gcc/cp/decl.c
+++ b/gcc/cp/decl.c
@@ -146,6 +146,12 @@ tree static_aggregates;
/* Like static_aggregates, but for thread_local variables. */
tree tls_aggregates;
+/* A list of all objects with initializers. Unlike static_aggegates,
+ this list is not cleared by the frontend. The decl is stored in
+ the TREE_VALUE slot and the initializer is stored in the
+ TREE_PURPOSE slot. */
+tree dynamic_initializers;
+
/* -- end of C++ */
/* A node for the integer constant 2. */
diff --git a/gcc/cp/decl2.c b/gcc/cp/decl2.c
index 1bc7b7e..4fcf82a 100644
--- a/gcc/cp/decl2.c
+++ b/gcc/cp/decl2.c
@@ -4940,6 +4940,11 @@ c_parse_final_cleanups (void)
loop. */
vars = prune_vars_needing_no_initialization (&static_aggregates);
+ /* Copy the contents of VARS into DYNAMIC_INITIALIZERS. */
+ for (t = vars; t; t = TREE_CHAIN (t))
+ dynamic_initializers = tree_cons (TREE_PURPOSE (t), TREE_VALUE (t),
+ dynamic_initializers);
+
if (vars)
{
/* We need to start a new initialization function each time
diff --git a/gcc/langhooks-def.h b/gcc/langhooks-def.h
index 2f66f5e..d19fbee 100644
--- a/gcc/langhooks-def.h
+++ b/gcc/langhooks-def.h
@@ -87,6 +87,7 @@ extern void lhd_omp_firstprivatize_type_sizes (struct
gimplify_omp_ctx *,
tree);
extern bool lhd_omp_mappable_type (tree);
extern bool lhd_omp_scalar_p (tree);
+extern tree* lhd_get_decl_init (tree);
extern const char *lhd_get_substring_location (const substring_loc &,
location_t *out_loc);
@@ -265,6 +266,7 @@ extern tree lhd_unit_size_without_reusable_padding (tree);
#define LANG_HOOKS_OMP_CLAUSE_DTOR hook_tree_tree_tree_null
#define LANG_HOOKS_OMP_FINISH_CLAUSE lhd_omp_finish_clause
#define LANG_HOOKS_OMP_SCALAR_P lhd_omp_scalar_p
+#define LANG_HOOKS_GET_DECL_INIT lhd_get_decl_init
#define LANG_HOOKS_DECLS { \
LANG_HOOKS_GLOBAL_BINDINGS_P, \
@@ -293,7 +295,8 @@ extern tree lhd_unit_size_without_reusable_padding (tree);
LANG_HOOKS_OMP_CLAUSE_LINEAR_CTOR, \
LANG_HOOKS_OMP_CLAUSE_DTOR, \
LANG_HOOKS_OMP_FINISH_CLAUSE, \
- LANG_HOOKS_OMP_SCALAR_P \
+ LANG_HOOKS_OMP_SCALAR_P, \
+ LANG_HOOKS_GET_DECL_INIT \
}
/* LTO hooks. */
diff --git a/gcc/langhooks.c b/gcc/langhooks.c
index d82f542..6aa96bb 100644
--- a/gcc/langhooks.c
+++ b/gcc/langhooks.c
@@ -632,6 +632,14 @@ lhd_omp_scalar_p (tree decl)
return false;
}
+/* Return static initializer for DECL. */
+
+tree*
+lhd_get_decl_init (tree decl)
+{
+ return &DECL_INITIAL (decl);
+}
+
/* Register language specific type size variables as potentially OpenMP
firstprivate variables. */
diff --git a/gcc/langhooks.h b/gcc/langhooks.h
index f12589e..6cf2c59 100644
--- a/gcc/langhooks.h
+++ b/gcc/langhooks.h
@@ -299,6 +299,11 @@ struct lang_hooks_for_decls
/* Return true if DECL is a scalar variable (for the purpose of
implicit firstprivatization). */
bool (*omp_scalar_p) (tree decl);
+
+ /* Return a pointer to the tree representing the initializer
+ expression for the non-local variable DECL. Return NULL if
+ DECL is not initialized. */
+ tree* (*get_decl_init) (tree decl);
};
/* Language hooks related to LTO serialization. */
diff --git a/gcc/omp-offload.c b/gcc/omp-offload.c
index 9013961..d1c69ed 100644
--- a/gcc/omp-offload.c
+++ b/gcc/omp-offload.c
@@ -315,7 +315,7 @@ omp_discover_declare_target_var_r (tree *tp, int
*walk_subtrees, void *data)
DECL_ATTRIBUTES (*tp)
= remove_attribute ("omp declare target link", DECL_ATTRIBUTES
(*tp));
}
- if (TREE_STATIC (*tp) && DECL_INITIAL (*tp))
+ if (TREE_STATIC (*tp) && lang_hooks.decls.get_decl_init (*tp))
((vec<tree> *) data)->safe_push (*tp);
DECL_ATTRIBUTES (*tp) = tree_cons (id, NULL_TREE, DECL_ATTRIBUTES (*tp));
symtab_node *node = symtab_node::get (*tp);
@@ -361,14 +361,15 @@ omp_discover_implicit_declare_target (void)
&& DECL_STRUCT_FUNCTION (cgn->decl)->has_omp_target)
worklist.safe_push (cgn->decl);
}
- FOR_EACH_STATIC_INITIALIZER (vnode)
- if (omp_declare_target_var_p (vnode->decl))
+ FOR_EACH_VARIABLE (vnode)
+ if (lang_hooks.decls.get_decl_init (vnode->decl)
+ && omp_declare_target_var_p (vnode->decl))
worklist.safe_push (vnode->decl);
while (!worklist.is_empty ())
{
tree decl = worklist.pop ();
if (VAR_P (decl))
- walk_tree_without_duplicates (&DECL_INITIAL (decl),
+ walk_tree_without_duplicates (lang_hooks.decls.get_decl_init (decl),
omp_discover_declare_target_var_r,
&worklist);
else if (omp_declare_target_fn_p (decl))
diff --git a/gcc/testsuite/g++.dg/gomp/declare-target-3.C
b/gcc/testsuite/g++.dg/gomp/declare-target-3.C
new file mode 100644
index 0000000..8e9eafc
--- /dev/null
+++ b/gcc/testsuite/g++.dg/gomp/declare-target-3.C
@@ -0,0 +1,31 @@
+// { dg-do compile }
+// { dg-additional-options "-fdump-tree-gimple" }
+
+// Test implicit marking of declare target to.
+
+int foo () { return 1; }
+int bar () { return 2; } // Implicitly marked (due to b)
+int baz () { return 3; } // Implicitly marked (due to d via c)
+int qux () { return 4; } // Implicitly marked (due to g via f and e)
+
+int a = foo ();
+int b = bar (); // Explicitly marked
+int c = baz (); // Implicitly marked (due to d)
+int *d = &c; // Explicitly marked
+int e = qux (); // Implicitly marked (due to g via f)
+int f = e + 1; // Implicitly marked (due to g)
+int *g = &f; // Explicitly marked
+
+#pragma omp declare target to(b, d, g)
+
+// { dg-final { scan-tree-dump-not "__attribute__\\\(\\\(omp declare
target\\\)\\\)\\\nfoo \\\(\\\)" "gimple" } }
+// { dg-final { scan-tree-dump "__attribute__\\\(\\\(omp declare
target\\\)\\\)\\\nbar \\\(\\\)" "gimple" } }
+// { dg-final { scan-tree-dump "__attribute__\\\(\\\(omp declare
target\\\)\\\)\\\nbaz \\\(\\\)" "gimple" } }
+// { dg-final { scan-tree-dump "__attribute__\\\(\\\(omp declare
target\\\)\\\)\\\nqux \\\(\\\)" "gimple" } }
+// { dg-final { scan-assembler-not "\\\.offload_var_table:\\n.+\\\.quad\\s+a"
} }
+// { dg-final { scan-assembler "\\\.offload_var_table:\\n.+\\\.quad\\s+b" } }
+// { dg-final { scan-assembler "\\\.offload_var_table:\\n.+\\\.quad\\s+c" } }
+// { dg-final { scan-assembler "\\\.offload_var_table:\\n.+\\\.quad\\s+d" } }
+// { dg-final { scan-assembler "\\\.offload_var_table:\\n.+\\\.quad\\s+e" } }
+// { dg-final { scan-assembler "\\\.offload_var_table:\\n.+\\\.quad\\s+f" } }
+// { dg-final { scan-assembler "\\\.offload_var_table:\\n.+\\\.quad\\s+g" } }
--
2.8.1