On 2022-01-03 22:30, Richard Biener wrote:
On Wed, 22 Dec 2021, Jiufu Guo wrote:

Hi,

Normaly, estimate_numbers_of_iterations get/caculate niter first,
and then invokes infer_loop_bounds_from_undefined. While in some case,
after a few call stacks, estimate_numbers_of_iterations is invoked before
niter is ready (e.g. before number_of_latch_executions returns).

e.g. number_of_latch_executions->...follow_ssa_edge_expr-->
--> estimate_numbers_of_iterations --> infer_loop_bounds_from_undefined.

Since niter is still not computed, call to infer_loop_bounds_from_undefined
may not get final result.
To avoid infer_loop_bounds_from_undefined to be called with interim state and avoid infer_loop_bounds_from_undefined generates interim data, during niter's computing, we could disable flag_aggressive_loop_optimizations.

Bootstrap and regtest pass on ppc64* and x86_64. Is this ok for trunk?

So this is a optimality fix, not a correctness one?  I suppose the
estimates are computed/used from scev_probably_wraps_p via
loop_exits_before_overflow and ultimatively chrec_convert.

We have a call cycle here,

estimate_numbers_of_iterations -> number_of_latch_executions ->
... -> estimate_numbers_of_iterations

where the first estimate_numbers_of_iterations will make sure
the later call will immediately return.

Hi Richard,
Thanks for your comments! And sorry for the late reply.

In estimate_numbers_of_iterations, there is a guard to make sure
the second call to estimate_numbers_of_iterations returns
immediately.

Exactly as you said, it relates to scev_probably_wraps_p calls
loop_exits_before_overflow.

The issue is: the first calling to estimate_numbers_of_iterations
maybe inside number_of_latch_executions.


I'm not sure what your patch tries to do - it seems to tackle
the case where we enter the cycle via number_of_latch_executions?
Why do we get "non-final" values?  idx_infer_loop_bounds resorts

Right, when the call cycle starts from number_of_latch_execution,
the issue may occur:

number_of_latch_executions(*1st call)->..->
analyze_scalar_evolution(IVs 1st) ->..follow_ssa_edge_expr..->
loop_exits_before_overflow->
estimate_numbers_of_iterations (*1st call)->
number_of_latch_executions(*2nd call)->..->
analyze_scalar_evolution(IVs 2nd)->..loop_exits_before_overflow-> estimate_numbers_of_iterations(*2nd call)

The second calling to estimate_numbers_of_iterations returns quickly.
And then, in the first calling to estimate_numbers_of_iterations,
infer_loop_bounds_from_undefined is invoked.

And, function "infer_loop_bounds_from_undefined" instantiate/analyze
SCEV for each SSA in the loop.
*Here the issue occur*, these SCEVs are based on the interim IV's
SCEV which come from "analyze_scalar_evolution(IVs 2nd)",
and those IV's SCEV will be overridden by up level
"analyze_scalar_evolution(IVs 1st)".

To handle this issue, disabling flag_aggressive_loop_optimizations
inside number_of_latch_executions is one method.
To avoid the issue in other cases, e.g. the call cycle starts from
number_of_iterations_exit or number_of_iterations_exit_assumptions,
this patch disable flag_aggressive_loop_optimizations inside
number_of_iterations_exit_assumptions.

Thanks again.

BR,
Jiufu

to SCEV and thus may recurse again - to me it would be more
logical to try avoid recursing in number_of_latch_executions by
setting ->nb_iterations to something early, maybe chrec_dont_know,
to signal we're using something we're just trying to compute.

Richard.

BR,
Jiufu

gcc/ChangeLog:

        * tree-ssa-loop-niter.c (number_of_iterations_exit_assumptions):
        Disable/restore flag_aggressive_loop_optimizations.

gcc/testsuite/ChangeLog:

        * gcc.dg/tree-ssa/scev-16.c: New test.

---
 gcc/tree-ssa-loop-niter.c               | 23 +++++++++++++++++++----
 gcc/testsuite/gcc.dg/tree-ssa/scev-16.c | 20 ++++++++++++++++++++
 2 files changed, 39 insertions(+), 4 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/scev-16.c

diff --git a/gcc/tree-ssa-loop-niter.c b/gcc/tree-ssa-loop-niter.c
index 06954e437f5..51bb501019e 100644
--- a/gcc/tree-ssa-loop-niter.c
+++ b/gcc/tree-ssa-loop-niter.c
@@ -2534,18 +2534,31 @@ number_of_iterations_exit_assumptions (class loop *loop, edge exit,
       && !POINTER_TYPE_P (type))
     return false;

+  /* Before niter is calculated, avoid to analyze interim state. */
+ int old_aggressive_loop_optimizations = flag_aggressive_loop_optimizations;
+  flag_aggressive_loop_optimizations = 0;
+
   tree iv0_niters = NULL_TREE;
   if (!simple_iv_with_niters (loop, loop_containing_stmt (stmt),
                              op0, &iv0, safe ? &iv0_niters : NULL, false))
-    return number_of_iterations_popcount (loop, exit, code, niter);
+    {
+ bool res = number_of_iterations_popcount (loop, exit, code, niter); + flag_aggressive_loop_optimizations = old_aggressive_loop_optimizations;
+      return res;
+    }
   tree iv1_niters = NULL_TREE;
   if (!simple_iv_with_niters (loop, loop_containing_stmt (stmt),
                              op1, &iv1, safe ? &iv1_niters : NULL, false))
-    return false;
+    {
+ flag_aggressive_loop_optimizations = old_aggressive_loop_optimizations;
+      return false;
+    }
   /* Give up on complicated case.  */
   if (iv0_niters && iv1_niters)
-    return false;
-
+    {
+ flag_aggressive_loop_optimizations = old_aggressive_loop_optimizations;
+      return false;
+    }
   /* We don't want to see undefined signed overflow warnings while
      computing the number of iterations.  */
   fold_defer_overflow_warnings ();
@@ -2565,6 +2578,7 @@ number_of_iterations_exit_assumptions (class loop *loop, edge exit,
                                  only_exit_p, safe))
     {
       fold_undefer_and_ignore_overflow_warnings ();
+ flag_aggressive_loop_optimizations = old_aggressive_loop_optimizations;
       return false;
     }

@@ -2608,6 +2622,7 @@ number_of_iterations_exit_assumptions (class loop *loop, edge exit,
                                               niter->may_be_zero);

   fold_undefer_and_ignore_overflow_warnings ();
+ flag_aggressive_loop_optimizations = old_aggressive_loop_optimizations;

   /* If NITER has simplified into a constant, update MAX.  */
   if (TREE_CODE (niter->niter) == INTEGER_CST)
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/scev-16.c b/gcc/testsuite/gcc.dg/tree-ssa/scev-16.c
new file mode 100644
index 00000000000..708ffab88ca
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/scev-16.c
@@ -0,0 +1,20 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-ivopts-scev" } */
+
+/* Expect scalar_evolution = {(int) start_7(D), +, 1}_1), instead
+   (int) (short int) {(unsigned short) start_7(D), +, 1}_1 */
+
+int arr[1000];
+
+void
+s2i (short start, short end)
+{
+  int res = 0;
+  for (short i = start; i < end; i++)
+    {
+      int lv = i;
+      arr[lv] += lv;
+    }
+}
+
+/* { dg-final { scan-tree-dump-times "scalar_evolution = \{\\(int\\) start_\[0-9\]+\\(D\\), \\+, 1\}_1" 1 "ivopts" } } */

Reply via email to