On 08/02/16 11:54, Richard Biener wrote:
On Mon, 8 Feb 2016, Tom de Vries wrote:

Hi,

when compiling the fipa-pta tests in the libgomp testsuite (omp-nested-2.c,
pr46032.c) with -flto -flto-partition=max, the tests fail in execution
(PR69599).

The problem is related to the GOMP/GOACC_parallel optimization we do in
fipa-pta, where we interpret a call GOMP_parallel (&foo._0, data) as a call
foo._0 (data).

The problem is that this optimization is only legal in lto if both:
- foo containing the call GOMP_parallel (&foo._0, data) and
- foo._0
are contained in the same partition.

In the case of -flto-partition=max, foo is contained in its own partition,
and foo._0 is contained in another partition.  This means the data argument to
the GOMP_parallel call appears unused, and the setting of the argument is
optimized away, which causes the execution failure.
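
To make the failure mode concrete, here is a minimal sketch in plain C of the
shape of code involved.  It is an illustration under stated assumptions, not
the actual compiler output: foo_outlined stands in for the outlined region
foo._0, run_on_threads stands in for GOMP_parallel (the real libgomp entry
point takes more arguments and starts a team of threads), and struct omp_data
is a hypothetical layout for the shared-data record.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical shared-data record for the parallel region.  */
struct omp_data
{
  int *sum;
  int n;
};

/* Stand-in for the outlined region body (foo._0 in the discussion).  */
static void
foo_outlined (void *data)
{
  struct omp_data *d = (struct omp_data *) data;
  assert (d != NULL);
  for (int i = 1; i <= d->n; i++)
    *d->sum += i;
}

/* Stand-in for GOMP_parallel: runs FN on DATA in the calling thread.  */
static void
run_on_threads (void (*fn) (void *), void *data)
{
  fn (data);
}

int
foo (int n)
{
  int sum = 0;
  struct omp_data data;

  /* These stores are only read inside foo_outlined.  If the analysis
     treats the call below as a direct call foo_outlined (&data) but the
     body of foo_outlined lives in another partition, nothing visible
     reads DATA and the stores look dead.  */
  data.sum = &sum;
  data.n = n;

  run_on_threads (foo_outlined, &data);
  return sum;
}
```

With the stores to data intact, foo (4) returns 10.  If they are removed as
dead, the outlined body reads an uninitialized record, which matches the kind
of execution failure described above.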

This patch fixes that by testing if foo and foo._0 are part of the same
partition.

[ Note that the node_address_taken change in the patch has no effect, since
nonlocal_p already tests for used_from_other_partition. But I thought it was
clearer to state the conditions under which we are allowed to ignore
node->address_taken explicitly. ]

Bootstrapped and reg-tested on x86_64.

Built for nvidia accelerator and reg-tested libgomp with various lto settings.

OK for trunk, stage4?

I don't like the in_lto_p checks, why's the check not working
for non-LTO?


I was not sure if the partition flags were valid outside lto.

Updated patch removes the in_lto_p checks.

Bootstrapped on x86_64.

Built and reg-tested libgomp testsuite.

OK?

Thanks,
- Tom


Fix GOMP/GOACC_parallel optimization in ipa-pta

2016-02-08  Tom de Vries  <t...@codesourcery.com>

	PR tree-optimization/69599
	* tree-ssa-structalias.c (fndecl_maybe_in_other_partition): New
	function.
	(find_func_aliases_for_builtin_call, find_func_clobbers)
	(ipa_pta_execute):  Handle case that foo and foo._0 are not in same lto
	partition.

	* testsuite/libgomp.c/omp-nested-3.c: New test.
	* testsuite/libgomp.c/pr46032-2.c: New test.
	* testsuite/libgomp.oacc-c-c++-common/kernels-2.c: New test.
	* testsuite/libgomp.oacc-c-c++-common/parallel-2.c: New test.

---
 gcc/tree-ssa-structalias.c                         | 49 ++++++++++++++++++----
 libgomp/testsuite/libgomp.c/omp-nested-3.c         |  4 ++
 libgomp/testsuite/libgomp.c/pr46032-2.c            |  4 ++
 .../libgomp.oacc-c-c++-common/kernels-2.c          |  4 ++
 .../libgomp.oacc-c-c++-common/parallel-2.c         |  4 ++
 5 files changed, 56 insertions(+), 9 deletions(-)

diff --git a/gcc/tree-ssa-structalias.c b/gcc/tree-ssa-structalias.c
index e7d0797..d7a7dc5 100644
--- a/gcc/tree-ssa-structalias.c
+++ b/gcc/tree-ssa-structalias.c
@@ -4162,6 +4162,18 @@ find_func_aliases_for_call_arg (varinfo_t fi, unsigned index, tree arg)
     process_constraint (new_constraint (lhs, *rhsp));
 }
 
+/* Return true if FNDECL may be part of another lto partition.  */
+
+static bool
+fndecl_maybe_in_other_partition (tree fndecl)
+{
+  cgraph_node *fn_node = cgraph_node::get (fndecl);
+  if (fn_node == NULL)
+    return true;
+
+  return fn_node->in_other_partition;
+}
+
 /* Create constraints for the builtin call T.  Return true if the call
    was handled, otherwise false.  */
 
@@ -4537,6 +4549,10 @@ find_func_aliases_for_builtin_call (struct function *fn, gcall *t)
 	      tree fnarg = gimple_call_arg (t, fnpos);
 	      gcc_assert (TREE_CODE (fnarg) == ADDR_EXPR);
 	      tree fndecl = TREE_OPERAND (fnarg, 0);
+	      if (fndecl_maybe_in_other_partition (fndecl))
+		/* Fallthru to general call handling.  */
+		break;
+
 	      tree arg = gimple_call_arg (t, argpos);
 
 	      varinfo_t fi = get_vi_for_tree (fndecl);
@@ -5113,6 +5129,10 @@ find_func_clobbers (struct function *fn, gimple *origt)
 	      tree fnarg = gimple_call_arg (t, fnpos);
 	      gcc_assert (TREE_CODE (fnarg) == ADDR_EXPR);
 	      tree fndecl = TREE_OPERAND (fnarg, 0);
+	      if (fndecl_maybe_in_other_partition (fndecl))
+		/* Fallthru to general call handling.  */
+		break;
+
 	      varinfo_t cfi = get_vi_for_tree (fndecl);
 
 	      tree arg = gimple_call_arg (t, argpos);
@@ -7505,9 +7525,13 @@ ipa_pta_execute (void)
 	 address_taken bit for function foo._0, which would make it non-local.
 	 But for the purpose of ipa-pta, we can regard the run_on_threads call
 	 as a local call foo._0 (data),  so we ignore address_taken on nodes
-	 with parallelized_function set.  */
-      bool node_address_taken = (node->address_taken
-				 && !node->parallelized_function);
+	 with parallelized_function set.
+	 Note: this is only safe, if foo and foo._0 are in the same lto
+	 partition.  */
+      bool node_address_taken = ((node->parallelized_function
+				  && !node->used_from_other_partition)
+				 ? false
+				 : node->address_taken);
 
       /* For externally visible or attribute used annotated functions use
 	 local constraints for their arguments.
@@ -7676,12 +7700,19 @@ ipa_pta_execute (void)
 		continue;
 
 	      /* Handle direct calls to functions with body.  */
-	      if (gimple_call_builtin_p (stmt, BUILT_IN_GOMP_PARALLEL))
-		decl = TREE_OPERAND (gimple_call_arg (stmt, 0), 0);
-	      else if (gimple_call_builtin_p (stmt, BUILT_IN_GOACC_PARALLEL))
-		decl = TREE_OPERAND (gimple_call_arg (stmt, 1), 0);
-	      else
-		decl = gimple_call_fndecl (stmt);
+	      decl = gimple_call_fndecl (stmt);
+
+	      {
+		tree called_decl = NULL_TREE;
+		if (gimple_call_builtin_p (stmt, BUILT_IN_GOMP_PARALLEL))
+		  called_decl = TREE_OPERAND (gimple_call_arg (stmt, 0), 0);
+		else if (gimple_call_builtin_p (stmt, BUILT_IN_GOACC_PARALLEL))
+		  called_decl = TREE_OPERAND (gimple_call_arg (stmt, 1), 0);
+
+		if (called_decl != NULL_TREE
+		    && !fndecl_maybe_in_other_partition (called_decl))
+		  decl = called_decl;
+	      }
 
 	      if (decl
 		  && (fi = lookup_vi_for_tree (decl))
diff --git a/libgomp/testsuite/libgomp.c/omp-nested-3.c b/libgomp/testsuite/libgomp.c/omp-nested-3.c
new file mode 100644
index 0000000..7790c58
--- /dev/null
+++ b/libgomp/testsuite/libgomp.c/omp-nested-3.c
@@ -0,0 +1,4 @@
+// { dg-do run { target lto } }
+// { dg-additional-options "-fipa-pta -flto -flto-partition=max" }
+
+#include "omp-nested-1.c"
diff --git a/libgomp/testsuite/libgomp.c/pr46032-2.c b/libgomp/testsuite/libgomp.c/pr46032-2.c
new file mode 100644
index 0000000..1125f6e
--- /dev/null
+++ b/libgomp/testsuite/libgomp.c/pr46032-2.c
@@ -0,0 +1,4 @@
+/* { dg-do run { target lto } } */
+/* { dg-options "-O2 -ftree-vectorize -std=c99 -fipa-pta -flto -flto-partition=max" } */
+
+#include "pr46032.c"
diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-2.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-2.c
new file mode 100644
index 0000000..f76c926
--- /dev/null
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-2.c
@@ -0,0 +1,4 @@
+/* { dg-do run { target lto } } */
+/* { dg-additional-options "-fipa-pta -flto -flto-partition=max" } */
+
+#include "kernels-1.c"
diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/parallel-2.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/parallel-2.c
new file mode 100644
index 0000000..d9fff6f
--- /dev/null
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/parallel-2.c
@@ -0,0 +1,4 @@
+/* { dg-do run { target lto } } */
+/* { dg-additional-options "-fipa-pta -flto -flto-partition=max" } */
+
+#include "parallel-1.c"
