> -----Original Message-----
> From: Richard Biener [mailto:richard.guent...@gmail.com]
> Sent: Wednesday, December 11, 2013 6:04 PM
> To: Jakub Jelinek
> Cc: Bin.Cheng; Jeff Law; Bin Cheng; gcc-patches List
> Subject: Re: [PATCH PR41488]Recognize more induction variables by
> simplifying PEELED chrec in scalar evolution
>
> On Wed, Dec 11, 2013 at 9:50 AM, Jakub Jelinek <ja...@redhat.com> wrote:
> > On Wed, Dec 11, 2013 at 04:31:55PM +0800, Bin.Cheng wrote:
> >> Thank both of you for being patient on this patch.
> >> I went through the documentation quickly and realized that I have to
> >> modify pointer-map structure to make it recognized by GC (maybe more
> >
> > Changing pointer_map at this point is IMHO not appropriate.
> >
> >> work suggested by Jakub). It seems I shouldn't include that task in
> >> this patch at this stage 3, I am thinking just call free_affine*
> >> function in the place it is created for SCEV. Of course, that makes
> >> it more expensive.
> >
> > Perhaps you could call free_affine_expand_cache say from
> > analyze_scalar_evolution, that is the innermost enclosing exported API
> > that indirectly calls your new code, but it seems it is also called
> > recursively from analyze_scalar_evolution_1 or functions it calls.
> > So maybe you'd need to rename analyze_scalar_evolution to
> > analyze_scalar_evolution_2 and adjust some calls to
> > analyze_scalar_evolution to the latter in tree-scalar-evolution.c, and
> > add analyze_scalar_evolution wrapper that calls
> analyze_scalar_evolution_2 and free_affine_expand_cache.
> > Whether this would work depends on detailed analysis of the
> > tree-scalar-evolution.c callgraph, there are tons of functions, most
> > of them static, and the question is if there is a single non-static
> > (or at most a few of them) function that cover all calls to your new
> > code in the static functions it (indirectly) calls, i.e. if there
> > isn't any other external entry point that might call your new code
> > without doing free_affine_expand_cache afterwards.
> >
> > If you can find that, I'd say it would be the safest thing to do.
> > But let's see what say Richard thinks about it too.
>
> I think keeping the SCEV cache live over multiple passes is fragile (we do
re-
> use SSA names and passes do not appropriately call scev_reset[_htab] in
all
> cases).
>
> With fixing that the easiest approach is to associate a affine-expand
cache
> with the scev info.
>
> Without that there isn't a very good solution other than discarding the
cache
> after each expansion at the point you use it. The SCEV cache should
> effectively make most of the affine expansion caching moot (but you can
try
> instrumenting that to verify it).
>
> That is, I suggest to try freeing the cache right after the second
> tree_to_aff_combination_expand.
>
> Btw,
>
> + /* Try harder to check if they are equal. */
> + tree_to_aff_combination_expand (left, type, &aff1, &peeled_chrec_map);
> + tree_to_aff_combination_expand (step_val, type, &aff2,
> + &peeled_chrec_map); aff_combination_scale (&aff2,
> + double_int_minus_one); aff_combination_add (&aff1, &aff2); left =
> + fold_convert (type, aff_combination_to_tree (&aff1));
> +
> + /* Transform (init, {left, right}_LOOP)_LOOP to {init, right}_LOOP
> + if "left" equals to "init + right". */ if (operand_equal_p
> + (left, integer_zero_node, 0))
>
> you don't need aff_combination_to_tree, simply check
>
> aff1.n == 0 && aff1.offset.is_zero ()
>
> (you may want to add aff_combination_zero_p or even
> aff_combination_equal_p)
Hi,
This is the new version patch with bug 59445 fixed and calling free_affine
stuff just after it is used. I also added aff_combination_zero_p as
suggested.
The change in add_old_iv_candidates is unnecessary now, since the previous
version patch triggered the assertion, I just leave it here as an additional
check.
At last, apology that I missed java in bootstrap before.
Bootstrap and test on x86/x86_64, arm is still ongoing. Is it OK if tests
pass.
Thanks,
bin
2013-12-12 Bin Cheng <bin.ch...@arm.com>
PR tree-optimization/58296
PR tree-optimization/41488
* tree-scalar-evolution.c: Include necessary header files.
(simplify_peeled_chrec): New function.
(analyze_evolution_in_loop): New static variable.
Call simplify_peeled_chrec.
* tree-ssa-loop-ivopts.c (mark_bivs): Don't mark peeled IV as biv.
(add_old_iv_candidates): Don't add candidate for peeled IV.
* tree-affine.h (aff_combination_zero_p): New function.
gcc/testsuite/ChangeLog
2013-12-12 Bin Cheng <bin.ch...@arm.com>
PR tree-optimization/58296
PR tree-optimization/41488
* gcc.dg/tree-ssa/scev-7.c: New test.
* gcc.dg/pr41488.c: New test.
* g++.dg/pr59445.C: New test.
Index: gcc/tree-scalar-evolution.c
===================================================================
--- gcc/tree-scalar-evolution.c (revision 205798)
+++ gcc/tree-scalar-evolution.c (working copy)
@@ -280,6 +280,8 @@ along with GCC; see the file COPYING3. If not see
#include "tree-ssa.h"
#include "cfgloop.h"
#include "tree-chrec.h"
+#include "pointer-set.h"
+#include "tree-affine.h"
#include "tree-scalar-evolution.h"
#include "dumpfile.h"
#include "params.h"
@@ -1380,7 +1382,65 @@ follow_ssa_edge (struct loop *loop, gimple def, gi
}
+/* Simplify PEELED_CHREC represented by (init_cond, arg) in LOOP.
+ Handle below case and return the corresponding POLYNOMIAL_CHREC:
+ # i_17 = PHI <i_13(5), 0(3)>
+ # _20 = PHI <_5(5), start_4(D)(3)>
+ ...
+ i_13 = i_17 + 1;
+ _5 = start_4(D) + i_13;
+
+ Though variable _20 appears as a PEELED_CHREC in the form of
+ (start_4, _5)_LOOP, it's a POLYNOMIAL_CHREC like {start_4, 1}_LOOP.
+
+ See PR41488. */
+
+static tree
+simplify_peeled_chrec (struct loop *loop, tree arg, tree init_cond)
+{
+ aff_tree aff1, aff2;
+ tree ev, left, right, type, step_val;
+ pointer_map_t *peeled_chrec_map = NULL;
+
+ ev = instantiate_parameters (loop, analyze_scalar_evolution (loop, arg));
+ if (ev == NULL_TREE || TREE_CODE (ev) != POLYNOMIAL_CHREC)
+ return chrec_dont_know;
+
+ left = CHREC_LEFT (ev);
+ right = CHREC_RIGHT (ev);
+ type = TREE_TYPE (left);
+ step_val = chrec_fold_plus (type, init_cond, right);
+
+ /* Transform (init, {left, right}_LOOP)_LOOP to {init, right}_LOOP
+ if "left" equals to "init + right". */
+ if (operand_equal_p (left, step_val, 0))
+ {
+ if (dump_file && (dump_flags & TDF_SCEV))
+ fprintf (dump_file, "Simplify PEELED_CHREC into POLYNOMIAL_CHREC.\n");
+
+ return build_polynomial_chrec (loop->num, init_cond, right);
+ }
+
+ /* Try harder to check if they are equal. */
+ tree_to_aff_combination_expand (left, type, &aff1, &peeled_chrec_map);
+ tree_to_aff_combination_expand (step_val, type, &aff2, &peeled_chrec_map);
+ free_affine_expand_cache (&peeled_chrec_map);
+ aff_combination_scale (&aff2, double_int_minus_one);
+ aff_combination_add (&aff1, &aff2);
+
+ /* Transform (init, {left, right}_LOOP)_LOOP to {init, right}_LOOP
+ if "left" equals to "init + right". */
+ if (aff_combination_zero_p (&aff1))
+ {
+ if (dump_file && (dump_flags & TDF_SCEV))
+ fprintf (dump_file, "Simplify PEELED_CHREC into POLYNOMIAL_CHREC.\n");
+
+ return build_polynomial_chrec (loop->num, init_cond, right);
+ }
+ return chrec_dont_know;
+}
+
/* Given a LOOP_PHI_NODE, this function determines the evolution
function from LOOP_PHI_NODE to LOOP_PHI_NODE in the loop. */
@@ -1392,6 +1452,7 @@ analyze_evolution_in_loop (gimple loop_phi_node,
tree evolution_function = chrec_not_analyzed_yet;
struct loop *loop = loop_containing_stmt (loop_phi_node);
basic_block bb;
+ static bool simplify_peeled_chrec_p = true;
if (dump_file && (dump_flags & TDF_SCEV))
{
@@ -1442,7 +1503,19 @@ analyze_evolution_in_loop (gimple loop_phi_node,
all the other iterations it has the value of ARG.
For the moment, PEELED_CHREC nodes are not built. */
if (res != t_true)
- ev_fn = chrec_dont_know;
+ {
+ ev_fn = chrec_dont_know;
+ /* Try to recognize POLYNOMIAL_CHREC which appears in
+ the form of PEELED_CHREC, but guard the process with
+ a bool variable to keep the analyzer from infinite
+ recurrence for real PEELED_RECs. */
+ if (simplify_peeled_chrec_p && TREE_CODE (arg) == SSA_NAME)
+ {
+ simplify_peeled_chrec_p = false;
+ ev_fn = simplify_peeled_chrec (loop, arg, init_cond);
+ simplify_peeled_chrec_p = true;
+ }
+ }
/* When there are multiple back edges of the loop (which in fact never
happens currently, but nevertheless), merge their evolutions. */
Index: gcc/testsuite/gcc.dg/tree-ssa/scev-7.c
===================================================================
--- gcc/testsuite/gcc.dg/tree-ssa/scev-7.c (revision 0)
+++ gcc/testsuite/gcc.dg/tree-ssa/scev-7.c (revision 0)
@@ -0,0 +1,18 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-sccp-scev" } */
+
+struct struct_t
+{
+ int* data;
+};
+
+void foo (struct struct_t* sp, int start, int end)
+{
+ int i;
+
+ for (i = 1000; i+start > end; i--)
+ sp->data[i+start] = 0;
+}
+
+/* { dg-final { scan-tree-dump-times "Simplify PEELED_CHREC into
POLYNOMIAL_CHREC" 1 "sccp" } } */
+/* { dg-final { cleanup-tree-dump "sccp" } } */
Index: gcc/testsuite/gcc.dg/pr41488.c
===================================================================
--- gcc/testsuite/gcc.dg/pr41488.c (revision 0)
+++ gcc/testsuite/gcc.dg/pr41488.c (revision 0)
@@ -0,0 +1,18 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-sccp-scev" } */
+
+struct struct_t
+{
+ int* data;
+};
+
+void foo (struct struct_t* sp, int start, int end)
+{
+ int i;
+
+ for (i = 0; i+start < end; i++)
+ sp->data[i+start] = 0;
+}
+
+/* { dg-final { scan-tree-dump-times "Simplify PEELED_CHREC into
POLYNOMIAL_CHREC" 1 "sccp" } } */
+/* { dg-final { cleanup-tree-dump "sccp" } } */
Index: gcc/testsuite/g++.dg/pr59445.C
===================================================================
--- gcc/testsuite/g++.dg/pr59445.C (revision 0)
+++ gcc/testsuite/g++.dg/pr59445.C (revision 0)
@@ -0,0 +1,81 @@
+/* { dg-do compile } */
+/* { dg-options "-O2" } */
+
+template <typename _Iterator> struct A;
+template <typename _Tp> struct A<_Tp *> {
+ typedef _Tp value_type;
+ typedef int difference_type;
+};
+template <typename _Compare> struct B {};
+template <typename _Compare> struct C {
+ _Compare _M_comp;
+ template <typename _Value, typename _Iterator>
+ int operator()(_Value &p1, _Iterator p2) {
+ return _M_comp(p1, *p2);
+ }
+};
+template <typename _Compare> C<_Compare> __val_comp_iter(B<_Compare>);
+template <typename _RandomAccessIterator, typename _Compare>
+void __unguarded_linear_insert(_RandomAccessIterator p1, _Compare p2) {
+ typename A<_RandomAccessIterator>::value_type a;
+ _RandomAccessIterator b = p1;
+ --b;
+ while (p2(a, b)) {
+ *p1 = 0;
+ p1 = b;
+ --b;
+ }
+}
+template <typename _RandomAccessIterator, typename _Compare>
+void __insertion_sort(_RandomAccessIterator, _Compare p2) {
+ for (_RandomAccessIterator c;; ++c)
+ __unguarded_linear_insert(c, __val_comp_iter(p2));
+}
+template <typename _RandomAccessIterator, typename _Distance, typename
_Compare>
+void __chunk_insertion_sort(_RandomAccessIterator, _Distance, _Compare p3) {
+ _RandomAccessIterator d;
+ __insertion_sort(d, p3);
+}
+template <typename _RandomAccessIterator, typename _Pointer, typename _Compare>
+void __merge_sort_with_buffer(_RandomAccessIterator p1, _Pointer, _Compare p3)
{
+ __chunk_insertion_sort(p1, 0, p3);
+}
+template <typename _RandomAccessIterator, typename _Pointer, typename
_Distance,
+ typename _Compare>
+void __stable_sort_adaptive(_RandomAccessIterator, _Pointer, _Distance,
+ _Compare p4) {
+ _RandomAccessIterator e;
+ __merge_sort_with_buffer(e, 0, p4);
+}
+template <typename _RandomAccessIterator, typename _Compare>
+void __stable_sort(_RandomAccessIterator p1, _Compare p2) {
+ __stable_sort_adaptive(
+ p1, 0, typename A<_RandomAccessIterator>::difference_type(), p2);
+}
+template <typename _RandomAccessIterator, typename _Compare>
+void stable_sort(_RandomAccessIterator, _RandomAccessIterator p2, _Compare) {
+ B<_Compare> f;
+ __stable_sort(p2, f);
+}
+class D {
+public:
+ void m_fn1();
+};
+class F {
+ struct G {
+ D MFI;
+ int operator()(int p1, int p2) {
+ if (p1)
+ return 0;
+ if (p2)
+ return 1;
+ MFI.m_fn1();
+ }
+ };
+ void m_fn1(int &p1) const;
+};
+void F::m_fn1(int &p1) const {
+ int *g, *h;
+ stable_sort(h, g, G());
+}
+
Index: gcc/tree-ssa-loop-ivopts.c
===================================================================
--- gcc/tree-ssa-loop-ivopts.c (revision 205798)
+++ gcc/tree-ssa-loop-ivopts.c (working copy)
@@ -1074,7 +1074,7 @@ find_bivs (struct ivopts_data *data)
static void
mark_bivs (struct ivopts_data *data)
{
- gimple phi;
+ gimple phi, def;
tree var;
struct iv *iv, *incr_iv;
struct loop *loop = data->current_loop;
@@ -1090,6 +1090,13 @@ mark_bivs (struct ivopts_data *data)
continue;
var = PHI_ARG_DEF_FROM_EDGE (phi, loop_latch_edge (loop));
+ def = SSA_NAME_DEF_STMT (var);
+ /* Don't mark iv peeled from other one as biv. */
+ if (def
+ && gimple_code (def) == GIMPLE_PHI
+ && gimple_bb (def) == loop->header)
+ continue;
+
incr_iv = get_iv (data, var);
if (!incr_iv)
continue;
@@ -2526,11 +2533,19 @@ add_old_iv_candidates (struct ivopts_data *data, s
/* Additionally record the possibility of leaving the original iv
untouched. */
def = PHI_ARG_DEF_FROM_EDGE (phi, loop_latch_edge (data->current_loop));
- cand = add_candidate_1 (data,
- iv->base, iv->step, true, IP_ORIGINAL, NULL,
- SSA_NAME_DEF_STMT (def));
- cand->var_before = iv->ssa_name;
- cand->var_after = def;
+ /* Don't add candidate if it's from another PHI node because
+ it's an affine iv appearing in the form of PEELED_CHREC. */
+ phi = SSA_NAME_DEF_STMT (def);
+ if (gimple_code (phi) != GIMPLE_PHI)
+ {
+ cand = add_candidate_1 (data,
+ iv->base, iv->step, true, IP_ORIGINAL, NULL,
+ SSA_NAME_DEF_STMT (def));
+ cand->var_before = iv->ssa_name;
+ cand->var_after = def;
+ }
+ else
+ gcc_assert (gimple_bb (phi) == data->current_loop->header);
}
}
Index: gcc/tree-affine.h
===================================================================
--- gcc/tree-affine.h (revision 205798)
+++ gcc/tree-affine.h (working copy)
@@ -80,3 +80,16 @@ bool aff_comb_cannot_overlap_p (aff_tree *, double
/* Debugging functions. */
void debug_aff (aff_tree *);
+
+/* Return true if AFF is actually ZERO. */
+static inline bool
+aff_combination_zero_p (aff_tree *aff)
+{
+ if (!aff)
+ return true;
+
+ if (aff->n == 0 && aff->offset.is_zero ())
+ return true;
+
+ return false;
+}