Hi folks.I have a pending patch for the path solver that pulls in global ranges when available (the stuff in SSA_NAME_RANGE_INFO). In doing so, I have run into a regression I was hoping the loop experts could pontificate on.
The regression comes from the simple_reduc_1() function in tree-ssa/loop-interchange-9.c, and it involves the following path:
=========== BB 5 ============ Imports: n_10(D) j_19 Exports: n_10(D) j_13 j_19 j_13 : j_19(I) n_10(D) int [1, 257] j_19 int [0, 256] Relational : (j_13 > j_19) <bb 5> [local count: 118111600]: # sum_21 = PHI <sum_14(4), sum_11(3)> c[j_19] = sum_21; j_13 = j_19 + 1; if (n_10(D) > j_13) goto <bb 9>; [89.00%] else goto <bb 6>; [11.00%] j_13 : int [1, 257] 5->9 (T) n_10(D) : int [2, 257] 5->9 (T) j_13 : int [1, 256] 5->9 (T) j_19 : int [0, 255] 5->6 (F) n_10(D) : int [1, 257] 5->6 (F) j_13 : int [1, 257] 5->6 (F) j_19 : int [0, 256] =========== BB 9 ============ n_10(D) int [2, 257] j_13 int [1, 256] Relational : (n_10(D) > j_19) Relational : (n_10(D) > j_13) <bb 9> [local count: 105119324]: goto <bb 3>; [100.00%] =========== BB 3 ============ Imports: n_10(D) Exports: n_10(D) n_10(D) int [1, +INF] <bb 3> [local count: 118111600]: # j_19 = PHI <j_13(9), 0(7)> sum_11 = c[j_19]; if (n_10(D) > 0) goto <bb 8>; [89.00%] else goto <bb 5>; [11.00%]With knowledge from previous passes (SSA_NAME_RANGE_INFO), we know that the loop cannot legally iterate outside the size of c[256]. So, j_13 lies within [1, 257] and n_10 is [2, +INF] at the end of the path. This allows us to thread BB3 to BB8. Doing so, causes the test to fail here:
/* { dg-final { scan-tree-dump-times "Loop_pair<outer:., inner:.> is interchanged" 1 "linterchange" } } */
All the blocks lie within the same loop, and the path passes all the tests in path_profitable_p().
Is there something about the shape of this path that should make it invalid in the backward threader, or should the test be marked with -fdisable-tree-threadN (or something else entirely)? Note that the backward threader is allowed to peel through loop headers.
I am attaching the gimple dump as well as the graphviz output for easier analysis.
Thanks. Aldy
__attribute__((noinline)) void simple_reduc_1 (int n) { int i; int sum; int j; int _1; int _2; int _3; <bb 2> [local count: 14598063]: if (n_10(D) > 0) goto <bb 7>; [89.00%] else goto <bb 6>; [11.00%] <bb 7> [local count: 12992276]: <bb 3> [local count: 118111600]: # j_19 = PHI <j_13(9), 0(7)> sum_11 = c[j_19]; if (n_10(D) > 0) goto <bb 8>; [89.00%] else goto <bb 5>; [11.00%] <bb 8> [local count: 105119324]: <bb 4> [local count: 955630225]: # sum_20 = PHI <sum_14(10), sum_11(8)> # i_22 = PHI <i_15(10), 0(8)> _1 = a[i_22][j_19]; _2 = b[i_22][j_19]; _3 = _1 * _2; sum_14 = _3 + sum_20; i_15 = i_22 + 1; if (n_10(D) > i_15) goto <bb 10>; [89.00%] else goto <bb 5>; [11.00%] <bb 10> [local count: 850510901]: goto <bb 4>; [100.00%] <bb 5> [local count: 118111600]: # sum_21 = PHI <sum_14(4), sum_11(3)> c[j_19] = sum_21; j_13 = j_19 + 1; if (n_10(D) > j_13) goto <bb 9>; [89.00%] else goto <bb 6>; [11.00%] <bb 9> [local count: 105119324]: goto <bb 3>; [100.00%] <bb 6> [local count: 14598063]: return; }
graph.ps
Description: PostScript document