On 08/02/16 11:54, Richard Biener wrote:
On Mon, 8 Feb 2016, Tom de Vries wrote:
Hi,
when compiling the fipa-pta tests in the libgomp testsuite (omp-nested-2.c,
pr46032.c) with -flto -flto-partition=max, the tests fail in execution
(PR69599).
The problem is related to the GOMP/GOACC_parallel optimization we do in
fipa-pta, where we interpret a call GOMP_parallel (&foo._0, data) as a call
foo._0 (data).
The problem is that this optimization is only legal in lto if both:
- foo containing the call GOMP_parallel (&foo._0, data) and
- foo._0
are contained in the same partition.
In the case of -flto-partition=max, foo is contained in its own partition,
and foo._0 is contained in another partition. This means the data argument to
the GOMP_parallel call appears unused, and the setting of the argument is
optimized away, which causes the execution failure.
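To make the failure mode concrete, here is a minimal C sketch of the pattern
involved. It is not libgomp code: run_parallel is a hypothetical stand-in for
GOMP_parallel that just calls its function argument once, foo_0 stands in for
the outlined function foo._0, and struct omp_data is an assumed shape for the
data block. The store to d.in is the kind of "setting of the argument" that
must not be removed, because it is only read through the data pointer.

```c
#include <stddef.h>

/* Assumed shape of the data block that foo hands to its outlined body.  */
struct omp_data
{
  int in;
  int out;
};

/* Stand-in for the outlined function foo._0: reads IN, writes OUT.  */
void
foo_0 (void *p)
{
  struct omp_data *d = (struct omp_data *) p;
  d->out = d->in * 2;
}

/* Hypothetical stand-in for GOMP_parallel: calls FN (DATA) exactly once.
   The real runtime would run FN on a team of threads.  */
void
run_parallel (void (*fn) (void *), void *data)
{
  fn (data);
}

/* Stand-in for foo, the function containing the parallel region.  */
int
foo (int x)
{
  struct omp_data d;
  d.in = x;   /* Only read via the pointer passed to run_parallel.  */
  d.out = 0;
  run_parallel (foo_0, &d);
  return d.out;
}
```

If an analysis wrongly concludes that nothing reads d through the pointer
passed to run_parallel, the store to d.in looks dead; foo (21) is expected to
return 42 only as long as that store is kept.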
This patch fixes that by testing if foo and foo._0 are part of the same
partition.
[ Note that the node_address_taken change in the patch has no effect, since
nonlocal_p already tests for used_from_other_partition. But I thought it was
clearer to state the conditions under which we are allowed to ignore
node->address_taken explicitly. ]
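The claim in the note can be checked with a small exhaustive model. This is
only a sketch: the node fields are passed in as plain flags, and nonlocal_model
is an assumed simplification of nonlocal_p that keeps just the two terms
relevant here (used_from_other_partition and node_address_taken).

```c
#include <stdbool.h>

/* Expression for node_address_taken before the patch.  */
bool
address_taken_old (bool address_taken, bool parallelized)
{
  return address_taken && !parallelized;
}

/* Expression for node_address_taken after the patch.  */
bool
address_taken_new (bool address_taken, bool parallelized,
                   bool used_from_other_partition)
{
  return (parallelized && !used_from_other_partition) ? false : address_taken;
}

/* Assumed simplification of nonlocal_p: only the two relevant terms.  */
bool
nonlocal_model (bool used_from_other_partition, bool node_address_taken)
{
  return used_from_other_partition || node_address_taken;
}

/* Count flag combinations where old and new disagree.  With
   COMPARE_NONLOCAL false, compare node_address_taken itself; with it true,
   compare the resulting nonlocal_model value.  */
int
count_mismatches (bool compare_nonlocal)
{
  int mismatches = 0;
  for (int bits = 0; bits < 8; bits++)
    {
      bool a = (bits & 1) != 0;   /* address_taken */
      bool p = (bits & 2) != 0;   /* parallelized_function */
      bool u = (bits & 4) != 0;   /* used_from_other_partition */
      bool old_v = address_taken_old (a, p);
      bool new_v = address_taken_new (a, p, u);
      if (compare_nonlocal)
        {
          old_v = nonlocal_model (u, old_v);
          new_v = nonlocal_model (u, new_v);
        }
      if (old_v != new_v)
        mismatches++;
    }
  return mismatches;
}
```

Under this model the two node_address_taken expressions differ in exactly one
case (all three flags set), and in that case used_from_other_partition already
forces the nonlocal result, so count_mismatches (true) is 0.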
Bootstrapped and reg-tested on x86_64.
Built for nvidia accelerator and reg-tested libgomp with various lto settings.
OK for trunk, stage4?
I don't like the in_lto_p checks, why's the check not working
for non-LTO?
I was not sure if the partition flags were valid outside lto.
Updated patch removes the in_lto_p checks.
Bootstrapped on x86_64.
Built and reg-tested libgomp testsuite.
OK?
Thanks,
- Tom
Fix GOMP/GOACC_parallel optimization in ipa-pta
2016-02-08 Tom de Vries <t...@codesourcery.com>
PR tree-optimization/69599
* tree-ssa-structalias.c (fndecl_maybe_in_other_partition): New
function.
(find_func_aliases_for_builtin_call, find_func_clobbers)
(ipa_pta_execute): Handle case that foo and foo._0 are not in same lto
partition.
* testsuite/libgomp.c/omp-nested-3.c: New test.
* testsuite/libgomp.c/pr46032-2.c: New test.
* testsuite/libgomp.oacc-c-c++-common/kernels-2.c: New test.
* testsuite/libgomp.oacc-c-c++-common/parallel-2.c: New test.
---
gcc/tree-ssa-structalias.c | 49 ++++++++++++++++++----
libgomp/testsuite/libgomp.c/omp-nested-3.c | 4 ++
libgomp/testsuite/libgomp.c/pr46032-2.c | 4 ++
.../libgomp.oacc-c-c++-common/kernels-2.c | 4 ++
.../libgomp.oacc-c-c++-common/parallel-2.c | 4 ++
5 files changed, 56 insertions(+), 9 deletions(-)
diff --git a/gcc/tree-ssa-structalias.c b/gcc/tree-ssa-structalias.c
index e7d0797..d7a7dc5 100644
--- a/gcc/tree-ssa-structalias.c
+++ b/gcc/tree-ssa-structalias.c
@@ -4162,6 +4162,18 @@ find_func_aliases_for_call_arg (varinfo_t fi, unsigned index, tree arg)
process_constraint (new_constraint (lhs, *rhsp));
}
+/* Return true if FNDECL may be part of another lto partition. */
+
+static bool
+fndecl_maybe_in_other_partition (tree fndecl)
+{
+ cgraph_node *fn_node = cgraph_node::get (fndecl);
+ if (fn_node == NULL)
+ return true;
+
+ return fn_node->in_other_partition;
+}
+
/* Create constraints for the builtin call T. Return true if the call
was handled, otherwise false. */
@@ -4537,6 +4549,10 @@ find_func_aliases_for_builtin_call (struct function *fn, gcall *t)
tree fnarg = gimple_call_arg (t, fnpos);
gcc_assert (TREE_CODE (fnarg) == ADDR_EXPR);
tree fndecl = TREE_OPERAND (fnarg, 0);
+ if (fndecl_maybe_in_other_partition (fndecl))
+ /* Fallthru to general call handling. */
+ break;
+
tree arg = gimple_call_arg (t, argpos);
varinfo_t fi = get_vi_for_tree (fndecl);
@@ -5113,6 +5129,10 @@ find_func_clobbers (struct function *fn, gimple *origt)
tree fnarg = gimple_call_arg (t, fnpos);
gcc_assert (TREE_CODE (fnarg) == ADDR_EXPR);
tree fndecl = TREE_OPERAND (fnarg, 0);
+ if (fndecl_maybe_in_other_partition (fndecl))
+ /* Fallthru to general call handling. */
+ break;
+
varinfo_t cfi = get_vi_for_tree (fndecl);
tree arg = gimple_call_arg (t, argpos);
@@ -7505,9 +7525,13 @@ ipa_pta_execute (void)
address_taken bit for function foo._0, which would make it non-local.
But for the purpose of ipa-pta, we can regard the run_on_threads call
as a local call foo._0 (data), so we ignore address_taken on nodes
- with parallelized_function set. */
- bool node_address_taken = (node->address_taken
- && !node->parallelized_function);
+ with parallelized_function set.
+ Note: this is only safe, if foo and foo._0 are in the same lto
+ partition. */
+ bool node_address_taken = ((node->parallelized_function
+ && !node->used_from_other_partition)
+ ? false
+ : node->address_taken);
/* For externally visible or attribute used annotated functions use
local constraints for their arguments.
@@ -7676,12 +7700,19 @@ ipa_pta_execute (void)
continue;
/* Handle direct calls to functions with body. */
- if (gimple_call_builtin_p (stmt, BUILT_IN_GOMP_PARALLEL))
- decl = TREE_OPERAND (gimple_call_arg (stmt, 0), 0);
- else if (gimple_call_builtin_p (stmt, BUILT_IN_GOACC_PARALLEL))
- decl = TREE_OPERAND (gimple_call_arg (stmt, 1), 0);
- else
- decl = gimple_call_fndecl (stmt);
+ decl = gimple_call_fndecl (stmt);
+
+ {
+ tree called_decl = NULL_TREE;
+ if (gimple_call_builtin_p (stmt, BUILT_IN_GOMP_PARALLEL))
+ called_decl = TREE_OPERAND (gimple_call_arg (stmt, 0), 0);
+ else if (gimple_call_builtin_p (stmt, BUILT_IN_GOACC_PARALLEL))
+ called_decl = TREE_OPERAND (gimple_call_arg (stmt, 1), 0);
+
+ if (called_decl != NULL_TREE
+ && !fndecl_maybe_in_other_partition (called_decl))
+ decl = called_decl;
+ }
if (decl
&& (fi = lookup_vi_for_tree (decl))
diff --git a/libgomp/testsuite/libgomp.c/omp-nested-3.c b/libgomp/testsuite/libgomp.c/omp-nested-3.c
new file mode 100644
index 0000000..7790c58
--- /dev/null
+++ b/libgomp/testsuite/libgomp.c/omp-nested-3.c
@@ -0,0 +1,4 @@
+// { dg-do run { target lto } }
+// { dg-additional-options "-fipa-pta -flto -flto-partition=max" }
+
+#include "omp-nested-1.c"
diff --git a/libgomp/testsuite/libgomp.c/pr46032-2.c b/libgomp/testsuite/libgomp.c/pr46032-2.c
new file mode 100644
index 0000000..1125f6e
--- /dev/null
+++ b/libgomp/testsuite/libgomp.c/pr46032-2.c
@@ -0,0 +1,4 @@
+/* { dg-do run { target lto } } */
+/* { dg-options "-O2 -ftree-vectorize -std=c99 -fipa-pta -flto -flto-partition=max" } */
+
+#include "pr46032.c"
diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-2.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-2.c
new file mode 100644
index 0000000..f76c926
--- /dev/null
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-2.c
@@ -0,0 +1,4 @@
+/* { dg-do run { target lto } } */
+/* { dg-additional-options "-fipa-pta -flto -flto-partition=max" } */
+
+#include "kernels-1.c"
diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/parallel-2.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/parallel-2.c
new file mode 100644
index 0000000..d9fff6f
--- /dev/null
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/parallel-2.c
@@ -0,0 +1,4 @@
+/* { dg-do run { target lto } } */
+/* { dg-additional-options "-fipa-pta -flto -flto-partition=max" } */
+
+#include "parallel-1.c"