[ was: Re: [PATCH, 10/16] Add pass_oacc_kernels pass group in passes.def ]
Hi,
Consider test-case test.c, with a use of the final value of the
iteration variable (return i):
...
unsigned int
foo (int *a, unsigned int n)
{
  unsigned int i;
  for (i = 0; i < n; ++i)
    a[i] = 1;
  return i;
}
...
Compiled with:
...
$ gcc -S -O2 test.c -ftree-parallelize-loops=2 -fdump-tree-all-details
...
Before parloops, we have:
...
  <bb 4>:
  # i_12 = PHI <0(3), i_10(5)>
  _5 = (long unsigned int) i_12;
  _6 = _5 * 4;
  _8 = a_7(D) + _6;
  *_8 = 1;
  i_10 = i_12 + 1;
  if (n_4(D) > i_10)
    goto <bb 5>;
  else
    goto <bb 6>;

  <bb 5>:
  goto <bb 4>;

  <bb 6>:
  # i_14 = PHI <n_4(D)(4), 0(2)>
...
Parloops will fail because:
...
phi is n_2 = PHI <n_4(D)(4)>
arg of phi to exit: value n_4(D) used outside loop
checking if it a part of reduction pattern:
FAILED: it is not a part of reduction....
...
[ Note that the phi in this dump looks slightly different from the one
in bb6 above. In gather_scalar_reductions -> vect_analyze_loop_form ->
vect_analyze_loop_form_1 -> split_loop_exit_edge we split the edge from
bb4 to bb6, which leaves a single-argument phi. ]
This patch calls scev_const_prop at the start of parloops.
scev_const_prop also first splits the exit edge, and then replaces the
phi with an assignment:
...
final value replacement:
n_2 = PHI <n_4(D)(4)>
with
n_2 = n_4(D);
...
This allows parloops to succeed.
The same holds when we additionally compile with -fno-tree-scev-cprop.
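At the source level, the effect of the final value replacement on test.c
can be pictured roughly as follows. This is only a hand-written sketch to
illustrate the transformation, not compiler output; the name
foo_final_value_replaced is hypothetical and not part of the patch. Since
the loop always exits with i == n (including n == 0), the use of i after
the loop can be rewritten as a use of n, and no scalar computed in the
loop is live after it any more:

```c
#include <assert.h>

/* The loop from test.c: the final value of i is used after the loop,
   which is what shows up as the loop-exit phi in the dump.  */
unsigned int
foo (int *a, unsigned int n)
{
  unsigned int i;
  for (i = 0; i < n; ++i)
    a[i] = 1;
  return i;
}

/* Hand-written source-level equivalent of the function after final
   value replacement (hypothetical name, for illustration only).
   The loop exits with i == n, so 'return i' becomes 'return n'.  */
unsigned int
foo_final_value_replaced (int *a, unsigned int n)
{
  unsigned int i;
  for (i = 0; i < n; ++i)
    a[i] = 1;
  return n;
}
```

Both functions store the same values to a and return the same result for
every n, which is why the replacement is valid.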
Bootstrapped and reg-tested on x86_64.
OK for stage3/stage1?
Thanks,
- Tom
Call scev_const_prop in pass_parallelize_loops::execute
2015-11-17 Tom de Vries <t...@codesourcery.com>
PR tree-optimization/68373
* tree-parloops.c (pass_parallelize_loops::execute): Call
scev_const_prop.
* gcc.dg/autopar/pr68373.c: New test.
---
gcc/testsuite/gcc.dg/autopar/pr68373.c | 14 ++++++++++++++
gcc/tree-parloops.c | 3 +++
2 files changed, 17 insertions(+)
diff --git a/gcc/testsuite/gcc.dg/autopar/pr68373.c b/gcc/testsuite/gcc.dg/autopar/pr68373.c
new file mode 100644
index 0000000..8e0f8a5
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/autopar/pr68373.c
@@ -0,0 +1,14 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -ftree-parallelize-loops=2 -fdump-tree-parloops-details" } */
+
+unsigned int
+foo (int *a, unsigned int n)
+{
+  unsigned int i;
+  for (i = 0; i < n; ++i)
+    a[i] = 1;
+
+  return i;
+}
+
+/* { dg-final { scan-tree-dump-times "SUCCESS: may be parallelized" 1 "parloops" } } */
diff --git a/gcc/tree-parloops.c b/gcc/tree-parloops.c
index 17415a8..d944395 100644
--- a/gcc/tree-parloops.c
+++ b/gcc/tree-parloops.c
@@ -2787,6 +2787,9 @@ pass_parallelize_loops::execute (function *fun)
   if (number_of_loops (fun) <= 1)
     return 0;
 
+  unsigned int sccp_todo = scev_const_prop ();
+  gcc_assert (sccp_todo == 0);
+
   if (parallelize_loops ())
     {
       fun->curr_properties &= ~(PROP_gimple_eomp);