On Fri, Jul 1, 2016 at 11:33 AM, Richard Biener <richard.guent...@gmail.com> wrote: > On Tue, Jun 28, 2016 at 8:18 AM, Bin Cheng <bin.ch...@arm.com> wrote: >> Hi, >> At the moment, loop niter analyzer depends on simple_iv to understand >> control induction variable in order to do further niter analysis. For cases >> reported in PR57558 (comment #4), the control variable is not an SCEV >> because it's converted from an smaller type and that type could >> overflow/wrap. As a result, niter information for such loops won't be >> analyzed because number_of_iterations_exit exits immediately once simple_iv >> fails. As a matter of fact, niter analyzer can be improved by computing an >> additional assumption under which the loop won't be infinite (or the >> corresponding iv doesn't overflow). This patch does so by changing both >> scev and niter analyzer. It introduces a new interface >> simple_iv_with_niters which is used in niter analyzer. The new interface >> returns an IV as well as a possible niter expression, the expression >> indicates the maximum number the IV can iterate before the returned >> simple_iv becoming invalid. Given below example: >> >> short unsigned int i; >> int _1; >> >> <bb 2>: >> goto <bb 4>; >> >> <bb 3>: >> arr[_1] = -1; >> i_6 = i_2 + 1; >> >> <bb 4>: >> # i_2 = PHI <1(2), i_6(3)> >> _1 = (int) i_2; >> if (_1 != 199) >> goto <bb 3>; >> else >> goto <bb 5>; >> >> <bb 5>: >> return; >> >> When analyzing variable "_1", the old interface simple_iv returns false, >> while the new interface returns <{1, 1}_loop, 65534>. It means "_1" is a >> valid SCEV as long as it doesn't iterate more than 65534 times. >> This patch also introduces a new interface in niter analyzer >> number_of_iterations_exit_assumptions. The new interface further derives >> assumption from the result of simple_iv_with_niters, and the assumption can >> be used by other optimizers. As for this loop, it's an unexpected gain >> because assumption (198 < 65534) is naturally TRUE. >> For loops that could be infinite, the assumption will be an expression. >> This improvement is still good, for example, the next patch to will >> vectorize such loops with help of this patch. >> >> This patch also changes how assumptions is handled in niter analyzer. At >> the moment, (non-true) assumptions are not supported unless >> -funsafe-loop-optimizations are specified, after this patch, the new >> interface exposes assumptions to niter user and let them make their own >> decision. In effect, this patch discards -funsafe-loop-optimizations on >> GIMPLE level, which we think is not a very useful option anyway. Next patch >> will xfails tests for this option. Well, the option is not totally >> discarded because it requires RTL changes too. I will follow up that after >> gimple part change. >> >> Bootstrap and test on x86_64 and AArch64. Is it OK? > > Please make simple_iv_with_niters accept NULL as iv_niters and pass > NULL from simple_iv to avoid useless work. > > You have chosen to make the flags per loop instead of say, flags on > the global loop state. The patch itself doesn't set the flag > on any loop thus it doesn't really belong into this patch IMHO, so > maybe you can split it out. We do already have a plethora > of "flags" (badly packed) in struct loop and while I see the interface > is sth very specific adding another 4 bytes doesn't look > too nice here (but maybe we don't care, struct loop isn't allocated > that much hopefully). I'd like to see a better comment > before the flags part in cfgloop.h though noting that those are not > flags computed by the compiler but flags that are set > by the consumer that affect semantics of certain (document which) > APIs. And that consumers are expected to re-set > flags back! (failing to do that might be a source of hard to track down > bugs?) > > Anyway, the _assumtions part of the patch is ok with the suggested change. > Hi Richard, According to your suggestion, I split the original patch into two, and this is the first part. It improves scalar evolution and loop niter analyzer so that (possible) infinite loop can be handled. As a bonus point, it also picks up normal loops which were missed before, for example, loop in the added test.
Bootstrap and test on x86_64 ongoing, Is it OK? Thanks, bin 2016-07-13 Bin Cheng <bin.ch...@arm.com> * tree-scalar-evolution.c (simple_iv_with_niters): New funcion. (derive_simple_iv_with_niters): New function. (simple_iv): Rewrite using simple_iv_with_niters. * tree-scalar-evolution.h (simple_iv_with_niters): New decl. * tree-ssa-loop-niter.c (number_of_iterations_exit_assumptions): New function. (number_of_iterations_exit): Rewrite using above function. * tree-ssa-loop-niter.h (number_of_iterations_exit_assumptions): New Decl. gcc/testsuite/ChangeLog 2016-07-13 Bin Cheng <bin.ch...@arm.com> * gcc.dg/tree-ssa/loop-41.c: New test.
diff --git a/gcc/tree-scalar-evolution.c b/gcc/tree-scalar-evolution.c index ae5e907..d8ed84a 100644 --- a/gcc/tree-scalar-evolution.c +++ b/gcc/tree-scalar-evolution.c @@ -3393,6 +3393,55 @@ iv_can_overflow_p (struct loop *loop, tree type, tree base, tree step) return false; } +/* Given EV with form of "(type) {inner_base, inner_step}_loop", this + function tries to derive condition under which it can be simplified + into "{(type)inner_base, (type)inner_step}_loop". The condition is + the maximum number that inner iv can iterate. */ + +static tree +derive_simple_iv_with_niters (tree ev, tree *niters) +{ + if (!CONVERT_EXPR_P (ev)) + return ev; + + tree inner_ev = TREE_OPERAND (ev, 0); + if (TREE_CODE (inner_ev) != POLYNOMIAL_CHREC) + return ev; + + tree init = CHREC_LEFT (inner_ev); + tree step = CHREC_RIGHT (inner_ev); + if (TREE_CODE (init) != INTEGER_CST + || TREE_CODE (step) != INTEGER_CST || integer_zerop (step)) + return ev; + + tree type = TREE_TYPE (ev); + tree inner_type = TREE_TYPE (inner_ev); + if (TYPE_PRECISION (inner_type) >= TYPE_PRECISION (type)) + return ev; + + /* Type conversion in "(type) {inner_base, inner_step}_loop" can be + folded only if inner iv won't overflow. We compute the maximum + number the inner iv can iterate before overflowing and return the + simplified affine iv. */ + tree delta; + init = fold_convert (type, init); + step = fold_convert (type, step); + ev = build_polynomial_chrec (CHREC_VARIABLE (inner_ev), init, step); + if (tree_int_cst_sign_bit (step)) + { + tree bound = lower_bound_in_type (inner_type, inner_type); + delta = fold_build2 (MINUS_EXPR, type, init, fold_convert (type, bound)); + step = fold_build1 (NEGATE_EXPR, type, step); + } + else + { + tree bound = upper_bound_in_type (inner_type, inner_type); + delta = fold_build2 (MINUS_EXPR, type, fold_convert (type, bound), init); + } + *niters = fold_build2 (FLOOR_DIV_EXPR, type, delta, step); + return ev; +} + /* Checks whether use of OP in USE_LOOP behaves as a simple affine iv with respect to WRTO_LOOP and returns its base and step in IV if possible (see analyze_scalar_evolution_in_loop for more details on USE_LOOP @@ -3410,13 +3459,29 @@ iv_can_overflow_p (struct loop *loop, tree type, tree base, tree step) not wrap by some other argument. Otherwise, this might introduce undefined behavior, and - for (i = iv->base; ; i = (type) ((unsigned type) i + (unsigned type) iv->step)) + i = iv->base; + for (; ; i = (type) ((unsigned type) i + (unsigned type) iv->step)) + + must be used instead. + + When IV_NITERS is not NULL, this function also checks case in which OP + is a conversion of an inner simple iv of below form: + + (outer_type){inner_base, inner_step}_loop. - must be used instead. */ + If type of inner iv has smaller precision than outer_type, it can't be + folded into {(outer_type)inner_base, (outer_type)inner_step}_loop because + the inner iv could overflow/wrap. In this case, we derive a condition + under which the inner iv won't overflow/wrap and do the simplification. + The derived condition normally is the maximum number the inner iv can + iterate, and will be stored in IV_NITERS. This is useful in loop niter + analysis, to derive break conditions when a loop must terminate, when is + infinite. */ bool -simple_iv (struct loop *wrto_loop, struct loop *use_loop, tree op, - affine_iv *iv, bool allow_nonconstant_step) +simple_iv_with_niters (struct loop *wrto_loop, struct loop *use_loop, + tree op, affine_iv *iv, tree *iv_niters, + bool allow_nonconstant_step) { enum tree_code code; tree type, ev, base, e, stop; @@ -3446,8 +3511,14 @@ simple_iv (struct loop *wrto_loop, struct loop *use_loop, tree op, return true; } - if (TREE_CODE (ev) != POLYNOMIAL_CHREC - || CHREC_VARIABLE (ev) != (unsigned) wrto_loop->num) + /* If we can derive valid scalar evolution with assumptions. */ + if (iv_niters && TREE_CODE (ev) != POLYNOMIAL_CHREC) + ev = derive_simple_iv_with_niters (ev, iv_niters); + + if (TREE_CODE (ev) != POLYNOMIAL_CHREC) + return false; + + if (CHREC_VARIABLE (ev) != (unsigned) wrto_loop->num) return false; iv->step = CHREC_RIGHT (ev); @@ -3544,6 +3615,17 @@ simple_iv (struct loop *wrto_loop, struct loop *use_loop, tree op, return true; } +/* Like simple_iv_with_niters, but return TRUE when OP behaves as a simple + affine iv unconditionally. */ + +bool +simple_iv (struct loop *wrto_loop, struct loop *use_loop, tree op, + affine_iv *iv, bool allow_nonconstant_step) +{ + return simple_iv_with_niters (wrto_loop, use_loop, op, iv, + NULL, allow_nonconstant_step); +} + /* Finalize the scalar evolution analysis. */ void diff --git a/gcc/tree-scalar-evolution.h b/gcc/tree-scalar-evolution.h index 382d717..a77e452 100644 --- a/gcc/tree-scalar-evolution.h +++ b/gcc/tree-scalar-evolution.h @@ -36,6 +36,8 @@ extern void gather_stats_on_scev_database (void); extern void final_value_replacement_loop (struct loop *); extern unsigned int scev_const_prop (void); extern bool expression_expensive_p (tree); +extern bool simple_iv_with_niters (struct loop *, struct loop *, tree, + struct affine_iv *, tree *, bool); extern bool simple_iv (struct loop *, struct loop *, tree, struct affine_iv *, bool); extern bool iv_can_overflow_p (struct loop *, tree, tree, tree); diff --git a/gcc/tree-ssa-loop-niter.c b/gcc/tree-ssa-loop-niter.c index 0723752..5bc1c00 100644 --- a/gcc/tree-ssa-loop-niter.c +++ b/gcc/tree-ssa-loop-niter.c @@ -2186,20 +2186,17 @@ loop_only_exit_p (const struct loop *loop, const_edge exit) } /* Stores description of number of iterations of LOOP derived from - EXIT (an exit edge of the LOOP) in NITER. Returns true if some - useful information could be derived (and fields of NITER has - meaning described in comments at struct tree_niter_desc - declaration), false otherwise. If WARN is true and - -Wunsafe-loop-optimizations was given, warn if the optimizer is going to use - potentially unsafe assumptions. + EXIT (an exit edge of the LOOP) in NITER. Returns true if some useful + information could be derived (and fields of NITER have meaning described + in comments at struct tree_niter_desc declaration), false otherwise. When EVERY_ITERATION is true, only tests that are known to be executed - every iteration are considered (i.e. only test that alone bounds the loop). + every iteration are considered (i.e. only test that alone bounds the loop). */ bool -number_of_iterations_exit (struct loop *loop, edge exit, - struct tree_niter_desc *niter, - bool warn, bool every_iteration) +number_of_iterations_exit_assumptions (struct loop *loop, edge exit, + struct tree_niter_desc *niter, + bool every_iteration) { gimple *last; gcond *stmt; @@ -2251,9 +2248,16 @@ number_of_iterations_exit (struct loop *loop, edge exit, && !POINTER_TYPE_P (type)) return false; - if (!simple_iv (loop, loop_containing_stmt (stmt), op0, &iv0, false)) + tree iv0_niters = NULL_TREE; + if (!simple_iv_with_niters (loop, loop_containing_stmt (stmt), + op0, &iv0, &iv0_niters, false)) return false; - if (!simple_iv (loop, loop_containing_stmt (stmt), op1, &iv1, false)) + tree iv1_niters = NULL_TREE; + if (!simple_iv_with_niters (loop, loop_containing_stmt (stmt), + op1, &iv1, &iv1_niters, false)) + return false; + /* Give up on complicated case. */ + if (iv0_niters && iv1_niters) return false; /* We don't want to see undefined signed overflow warnings while @@ -2269,6 +2273,24 @@ number_of_iterations_exit (struct loop *loop, edge exit, return false; } + /* Incorporate additional assumption implied by control iv. */ + tree iv_niters = iv0_niters ? iv0_niters : iv1_niters; + if (iv_niters) + { + tree assumption = fold_build2 (LE_EXPR, boolean_type_node, niter->niter, + fold_convert (TREE_TYPE (niter->niter), + iv_niters)); + + if (!integer_nonzerop (assumption)) + niter->assumptions = fold_build2 (TRUTH_AND_EXPR, boolean_type_node, + niter->assumptions, assumption); + + /* Refine upper bound if possible. */ + if (TREE_CODE (iv_niters) == INTEGER_CST + && niter->max > wi::to_widest (iv_niters)) + niter->max = wi::to_widest (iv_niters); + } + if (optimize >= 3) { niter->assumptions = simplify_using_outer_evolutions (loop, @@ -2291,44 +2313,22 @@ number_of_iterations_exit (struct loop *loop, edge exit, if (TREE_CODE (niter->niter) == INTEGER_CST) niter->max = wi::to_widest (niter->niter); - if (integer_onep (niter->assumptions)) - return true; - - /* With -funsafe-loop-optimizations we assume that nothing bad can happen. - But if we can prove that there is overflow or some other source of weird - behavior, ignore the loop even with -funsafe-loop-optimizations. */ - if (integer_zerop (niter->assumptions) || !single_exit (loop)) - return false; - - if (flag_unsafe_loop_optimizations) - niter->assumptions = boolean_true_node; + return (!integer_zerop (niter->assumptions)); +} - if (warn) - { - const char *wording; - location_t loc = gimple_location (stmt); - - /* We can provide a more specific warning if one of the operator is - constant and the other advances by +1 or -1. */ - if (!integer_zerop (iv1.step) - ? (integer_zerop (iv0.step) - && (integer_onep (iv1.step) || integer_all_onesp (iv1.step))) - : (integer_onep (iv0.step) || integer_all_onesp (iv0.step))) - wording = - flag_unsafe_loop_optimizations - ? N_("assuming that the loop is not infinite") - : N_("cannot optimize possibly infinite loops"); - else - wording = - flag_unsafe_loop_optimizations - ? N_("assuming that the loop counter does not overflow") - : N_("cannot optimize loop, the loop counter may overflow"); +/* Like number_of_iterations_exit, but return TRUE only if the niter + information holds unconditionally. */ - warning_at ((LOCATION_LINE (loc) > 0) ? loc : input_location, - OPT_Wunsafe_loop_optimizations, "%s", gettext (wording)); - } +bool +number_of_iterations_exit (struct loop *loop, edge exit, + struct tree_niter_desc *niter, + bool, bool every_iteration) +{ + if (!number_of_iterations_exit_assumptions (loop, exit, niter, + every_iteration)) + return false; - return flag_unsafe_loop_optimizations; + return (integer_nonzerop (niter->assumptions)); } /* Try to determine the number of iterations of LOOP. If we succeed, diff --git a/gcc/tree-ssa-loop-niter.h b/gcc/tree-ssa-loop-niter.h index 1b5d111..1aea580 100644 --- a/gcc/tree-ssa-loop-niter.h +++ b/gcc/tree-ssa-loop-niter.h @@ -27,6 +27,9 @@ extern bool loop_only_exit_p (const struct loop *, const_edge); extern bool number_of_iterations_exit (struct loop *, edge, struct tree_niter_desc *niter, bool, bool every_iteration = true); +extern bool number_of_iterations_exit_assumptions (struct loop *, edge, + struct tree_niter_desc *, + bool = true); extern tree find_loop_niter (struct loop *, edge *); extern bool finite_loop_p (struct loop *); extern tree loop_niter_by_eval (struct loop *, edge); diff --git a/gcc/testsuite/gcc.dg/tree-ssa/loop-41.c b/gcc/testsuite/gcc.dg/tree-ssa/loop-41.c new file mode 100644 index 0000000..52ba968 --- /dev/null +++ b/gcc/testsuite/gcc.dg/tree-ssa/loop-41.c @@ -0,0 +1,14 @@ +/* { dg-do compile } */ +/* { dg-options "-O1 -ftree-vrp -fdump-tree-vrp-alias" } */ + +signed char arr[240]; +void foo (void) +{ + + unsigned short i, length = 200; + + for (i = 1; (int)i < (length - 1); i++) + arr[i] = -1; +} + +/* { dg-final { scan-tree-dump-not "RANGE \\\[0, 65535\\\]" "vrp1" } } */