Re: Gimple loop splitting

Andrew Pinski Mon, 25 Jul 2016 00:01:17 -0700

On Thu, Nov 12, 2015 at 8:52 AM, Michael Matz <m...@suse.de> wrote:
> Hello,
>
> this new pass implements loop iteration space splitting for loops that
> contain a conditional that's always true for one part of the iteration
> space and false for the other, i.e. such situations:
>
>   for (i = beg; i < end; i++)
>     if (i < p)
>       dothis();
>     else
>       dothat();
>
> this is transformed into roughly:
>
>   for (i = beg; i < p; i++)
>     dothis();
>   for (; i < end; i++)
>     dothat();
>
> Of course, not quite the above as there needs to be provisions for the
> border conditions, if e.g. 'p' is outside the original iteration space, or
> the conditional doesn't directly use the control IV, but some other, or
> the IV runs backwards.  The testcase checks many of these border
> conditions.
>
> This transformation is in itself a good one but can also be an enabler for
> the vectorizer.  It does increase code size, when the loop body contains
> also unconditional code (that one is duplicated), so we only transform hot
> loops.  I'm a bit unsure of the placement of the new pass, or if it should
> be an own pass at all.  Right now I've placed it after unswitching and
> scev_cprop, before loop distribution.  Ideally I think all three, together
> with loop fusion and an gimple unroller should be integrated into one loop
> nest optimizer, alas, we aren't there yet.
>
> I'm planning to work on loop fusion in the future as well, but that's not
> for GCC 6.
>
> I've regstrapped this pass enabled with -O2 on x86-64-linux, without
> regressions.  I've also checked cpu2006 (the non-fortran part) for
> correctness, not yet for performance.  In the end it should probably only
> be enabled for -O3+ (although if the whole loop body is conditional it
> makes sense to also have it with -O2 because code growth is very small
> then).
>
> So, okay for trunk?


What ever happened to this patch?  I was looking into doing this
myself today but I found this patch.
It is stage 1 of GCC 7, it might be a good idea to get this patch into GCC.

Thanks,
Andrew

>
>
> Ciao,
> Michael.
>         * passes.def (pass_loop_split): Add.
>         * timevar.def (TV_LOOP_SPLIT): Add.
>         * tree-pass.h (make_pass_loop_split): Declare.
>         * tree-ssa-loop-manip.h (rewrite_into_loop_closed_ssa_1): Declare.
>         * tree-ssa-loop-unswitch.c: Include tree-ssa-loop-manip.h,
>         cfganal.h, tree-chrec.h, tree-affine.h, tree-scalar-evolution.h,
>         gimple-pretty-print.h, gimple-fold.h, gimplify-me.h.
>         (split_at_bb_p, patch_loop_exit, find_or_create_guard_phi,
>         split_loop, tree_ssa_split_loops,
>         make_pass_loop_split): New functions.
>         (pass_data_loop_split): New.
>         (pass_loop_split): New.
>
> testsuite/
>         * gcc.dg/loop-split.c: New test.
>
> Index: passes.def
> ===================================================================
> --- passes.def  (revision 229763)
> +++ passes.def  (working copy)
> @@ -233,6 +233,7 @@ along with GCC; see the file COPYING3.
>           NEXT_PASS (pass_dce);
>           NEXT_PASS (pass_tree_unswitch);
>           NEXT_PASS (pass_scev_cprop);
> +         NEXT_PASS (pass_loop_split);
>           NEXT_PASS (pass_record_bounds);
>           NEXT_PASS (pass_loop_distribution);
>           NEXT_PASS (pass_copy_prop);
> Index: timevar.def
> ===================================================================
> --- timevar.def (revision 229763)
> +++ timevar.def (working copy)
> @@ -179,6 +179,7 @@ DEFTIMEVAR (TV_LIM                   , "
>  DEFTIMEVAR (TV_TREE_LOOP_IVCANON     , "tree canonical iv")
>  DEFTIMEVAR (TV_SCEV_CONST            , "scev constant prop")
>  DEFTIMEVAR (TV_TREE_LOOP_UNSWITCH    , "tree loop unswitching")
> +DEFTIMEVAR (TV_LOOP_SPLIT            , "loop splitting")
>  DEFTIMEVAR (TV_COMPLETE_UNROLL       , "complete unrolling")
>  DEFTIMEVAR (TV_TREE_PARALLELIZE_LOOPS, "tree parallelize loops")
>  DEFTIMEVAR (TV_TREE_VECTORIZATION    , "tree vectorization")
> Index: tree-pass.h
> ===================================================================
> --- tree-pass.h (revision 229763)
> +++ tree-pass.h (working copy)
> @@ -366,6 +366,7 @@ extern gimple_opt_pass *make_pass_tree_n
>  extern gimple_opt_pass *make_pass_tree_loop_init (gcc::context *ctxt);
>  extern gimple_opt_pass *make_pass_lim (gcc::context *ctxt);
>  extern gimple_opt_pass *make_pass_tree_unswitch (gcc::context *ctxt);
> +extern gimple_opt_pass *make_pass_loop_split (gcc::context *ctxt);
>  extern gimple_opt_pass *make_pass_predcom (gcc::context *ctxt);
>  extern gimple_opt_pass *make_pass_iv_canon (gcc::context *ctxt);
>  extern gimple_opt_pass *make_pass_scev_cprop (gcc::context *ctxt);
> Index: tree-ssa-loop-manip.h
> ===================================================================
> --- tree-ssa-loop-manip.h       (revision 229763)
> +++ tree-ssa-loop-manip.h       (working copy)
> @@ -24,6 +24,8 @@ typedef void (*transform_callback)(struc
>
>  extern void create_iv (tree, tree, tree, struct loop *, gimple_stmt_iterator 
> *,
>                        bool, tree *, tree *);
> +extern void rewrite_into_loop_closed_ssa_1 (bitmap, unsigned, int,
> +                                           struct loop *);
>  extern void rewrite_into_loop_closed_ssa (bitmap, unsigned);
>  extern void rewrite_virtuals_into_loop_closed_ssa (struct loop *);
>  extern void verify_loop_closed_ssa (bool);
> Index: tree-ssa-loop-unswitch.c
> ===================================================================
> --- tree-ssa-loop-unswitch.c    (revision 229763)
> +++ tree-ssa-loop-unswitch.c    (working copy)
> @@ -31,12 +31,20 @@ along with GCC; see the file COPYING3.
>  #include "tree-ssa.h"
>  #include "tree-ssa-loop-niter.h"
>  #include "tree-ssa-loop.h"
> +#include "tree-ssa-loop-manip.h"
>  #include "tree-into-ssa.h"
> +#include "cfganal.h"
>  #include "cfgloop.h"
> +#include "tree-chrec.h"
> +#include "tree-affine.h"
> +#include "tree-scalar-evolution.h"
>  #include "params.h"
>  #include "tree-inline.h"
>  #include "gimple-iterator.h"
> +#include "gimple-pretty-print.h"
>  #include "cfghooks.h"
> +#include "gimple-fold.h"
> +#include "gimplify-me.h"
>
>  /* This file implements the loop unswitching, i.e. transformation of loops 
> like
>
> @@ -842,4 +850,551 @@ make_pass_tree_unswitch (gcc::context *c
>    return new pass_tree_unswitch (ctxt);
>  }
>
> +/* Return true when BB inside LOOP is a potential iteration space
> +   split point, i.e. ends with a condition like "IV < comp", which
> +   is true on one side of the iteration space and false on the other,
> +   and the split point can be computed.  If so, also return the border
> +   point in *BORDER and the comparison induction variable in IV.  */
>
> +static tree
> +split_at_bb_p (struct loop *loop, basic_block bb, tree *border, affine_iv 
> *iv)
> +{
> +  gimple *last;
> +  gcond *stmt;
> +  affine_iv iv2;
> +
> +  /* BB must end in a simple conditional jump.  */
> +  last = last_stmt (bb);
> +  if (!last || gimple_code (last) != GIMPLE_COND)
> +    return NULL_TREE;
> +  stmt = as_a <gcond *> (last);
> +
> +  enum tree_code code = gimple_cond_code (stmt);
> +
> +  /* Only handle relational comparisons, for equality and non-equality
> +     we'd have to split the loop into two loops and a middle statement.  */
> +  switch (code)
> +    {
> +      case LT_EXPR:
> +      case LE_EXPR:
> +      case GT_EXPR:
> +      case GE_EXPR:
> +       break;
> +      default:
> +       return NULL_TREE;
> +    }
> +
> +  if (loop_exits_from_bb_p (loop, bb))
> +    return NULL_TREE;
> +
> +  tree op0 = gimple_cond_lhs (stmt);
> +  tree op1 = gimple_cond_rhs (stmt);
> +
> +  if (!simple_iv (loop, loop, op0, iv, false))
> +    return NULL_TREE;
> +  if (!simple_iv (loop, loop, op1, &iv2, false))
> +    return NULL_TREE;
> +
> +  /* Make it so, that the first argument of the condition is
> +     the looping one.  */
> +  if (integer_zerop (iv->step))
> +    {
> +      std::swap (op0, op1);
> +      std::swap (*iv, iv2);
> +      code = swap_tree_comparison (code);
> +      gimple_cond_set_condition (stmt, code, op0, op1);
> +      update_stmt (stmt);
> +    }
> +
> +  if (integer_zerop (iv->step))
> +    return NULL_TREE;
> +  if (!integer_zerop (iv2.step))
> +    return NULL_TREE;
> +
> +  if (dump_file && (dump_flags & TDF_DETAILS))
> +    {
> +      fprintf (dump_file, "Found potential split point: ");
> +      print_gimple_stmt (dump_file, stmt, 0, TDF_SLIM);
> +      fprintf (dump_file, " { ");
> +      print_generic_expr (dump_file, iv->base, TDF_SLIM);
> +      fprintf (dump_file, " + I*");
> +      print_generic_expr (dump_file, iv->step, TDF_SLIM);
> +      fprintf (dump_file, " } %s ", get_tree_code_name (code));
> +      print_generic_expr (dump_file, iv2.base, TDF_SLIM);
> +      fprintf (dump_file, "\n");
> +    }
> +
> +  *border = iv2.base;
> +  return op0;
> +}
> +
> +/* Given a GUARD conditional stmt inside LOOP, which we want to make always
> +   true or false depending on INITIAL_TRUE, and adjusted values NEXTVAL
> +   (a post-increment IV) and NEWBOUND (the comparator) adjust the loop
> +   exit test statement to loop back only if the GUARD statement will
> +   also be true/false in the next iteration.  */
> +
> +static void
> +patch_loop_exit (struct loop *loop, gcond *guard, tree nextval, tree 
> newbound,
> +                bool initial_true)
> +{
> +  edge exit = single_exit (loop);
> +  gcond *stmt = as_a <gcond *> (last_stmt (exit->src));
> +  gimple_cond_set_condition (stmt, gimple_cond_code (guard),
> +                            nextval, newbound);
> +  update_stmt (stmt);
> +
> +  edge stay = single_pred_edge (loop->latch);
> +
> +  exit->flags &= ~(EDGE_TRUE_VALUE | EDGE_FALSE_VALUE);
> +  stay->flags &= ~(EDGE_TRUE_VALUE | EDGE_FALSE_VALUE);
> +
> +  if (initial_true)
> +    {
> +      exit->flags |= EDGE_FALSE_VALUE;
> +      stay->flags |= EDGE_TRUE_VALUE;
> +    }
> +  else
> +    {
> +      exit->flags |= EDGE_TRUE_VALUE;
> +      stay->flags |= EDGE_FALSE_VALUE;
> +    }
> +}
> +
> +/* Give an induction variable GUARD_IV, and its affine descriptor IV,
> +   find the loop phi node in LOOP defining it directly, or create
> +   such phi node.  Return that phi node.  */
> +
> +static gphi *
> +find_or_create_guard_phi (struct loop *loop, tree guard_iv, affine_iv * 
> /*iv*/)
> +{
> +  gimple *def = SSA_NAME_DEF_STMT (guard_iv);
> +  gphi *phi;
> +  if ((phi = dyn_cast <gphi *> (def))
> +      && gimple_bb (phi) == loop->header)
> +    return phi;
> +
> +  /* XXX Create the PHI instead.  */
> +  return NULL;
> +}
> +
> +/* Checks if LOOP contains an conditional block whose condition
> +   depends on which side in the iteration space it is, and if so
> +   splits the iteration space into two loops.  Returns true if the
> +   loop was split.  NITER must contain the iteration descriptor for the
> +   single exit of LOOP.  */
> +
> +static bool
> +split_loop (struct loop *loop, struct tree_niter_desc *niter)
> +{
> +  basic_block *bbs;
> +  unsigned i;
> +  bool changed = false;
> +  tree guard_iv;
> +  tree border;
> +  affine_iv iv;
> +
> +  bbs = get_loop_body (loop);
> +
> +  /* Find a splitting opportunity.  */
> +  for (i = 0; i < loop->num_nodes; i++)
> +    if ((guard_iv = split_at_bb_p (loop, bbs[i], &border, &iv)))
> +      {
> +       /* Handling opposite steps is not implemented yet.  Neither
> +          is handling different step sizes.  */
> +       if ((tree_int_cst_sign_bit (iv.step)
> +            != tree_int_cst_sign_bit (niter->control.step))
> +           || !tree_int_cst_equal (iv.step, niter->control.step))
> +         continue;
> +
> +       /* Find a loop PHI node that defines guard_iv directly,
> +          or create one doing that.  */
> +       gphi *phi = find_or_create_guard_phi (loop, guard_iv, &iv);
> +       if (!phi)
> +         continue;
> +       gcond *guard_stmt = as_a<gcond *> (last_stmt (bbs[i]));
> +       tree guard_init = PHI_ARG_DEF_FROM_EDGE (phi,
> +                                                loop_preheader_edge (loop));
> +       enum tree_code guard_code = gimple_cond_code (guard_stmt);
> +
> +       /* Loop splitting is implemented by versioning the loop, placing
> +          the new loop in front of the old loop, make the first loop iterate
> +          as long as the conditional stays true (or false) and let the
> +          second (original) loop handle the rest of the iterations.
> +
> +          First we need to determine if the condition will start being true
> +          or false in the first loop.  */
> +       bool initial_true;
> +       switch (guard_code)
> +         {
> +           case LT_EXPR:
> +           case LE_EXPR:
> +             initial_true = !tree_int_cst_sign_bit (iv.step);
> +             break;
> +           case GT_EXPR:
> +           case GE_EXPR:
> +             initial_true = tree_int_cst_sign_bit (iv.step);
> +             break;
> +           default:
> +             gcc_unreachable ();
> +         }
> +
> +       /* Build a condition that will skip the first loop when the
> +          guard condition won't ever be true (or false).  */
> +       tree cond = build2 (guard_code, boolean_type_node, guard_init, 
> border);
> +       if (initial_true)
> +         cond = fold_build1 (TRUTH_NOT_EXPR, boolean_type_node, cond);
> +
> +       /* Now version the loop, we will then have this situation:
> +          if (!cond)
> +            for (...) {body}   //floop
> +          else
> +            for (...) {body}   //loop
> +          join:  */
> +       initialize_original_copy_tables ();
> +       basic_block cond_bb;
> +       struct loop *floop = loop_version (loop, cond, &cond_bb,
> +                                          REG_BR_PROB_BASE, REG_BR_PROB_BASE,
> +                                          REG_BR_PROB_BASE, false);
> +       gcc_assert (floop);
> +       update_ssa (TODO_update_ssa);
> +
> +       /* Now diddle the exit edge of the first loop (floop->join in the
> +          above) to either go to the common exit (join) or to the second
> +          loop, depending on if there are still iterations left, or not.
> +          We split the floop exit edge and insert a copy of the
> +          original exit expression into the new block, that either
> +          skips the second loop or goes to it.  */
> +       edge exit = single_exit (floop);
> +       basic_block skip_bb = split_edge (exit);
> +       gcond *skip_stmt;
> +       gimple_stmt_iterator gsi;
> +       edge new_e, skip_e;
> +
> +       gimple *stmt = last_stmt (exit->src);
> +       skip_stmt = gimple_build_cond (gimple_cond_code (stmt),
> +                                      gimple_cond_lhs (stmt),
> +                                      gimple_cond_rhs (stmt),
> +                                      NULL_TREE, NULL_TREE);
> +       gsi = gsi_last_bb (skip_bb);
> +       gsi_insert_after (&gsi, skip_stmt, GSI_NEW_STMT);
> +
> +       skip_e = EDGE_SUCC (skip_bb, 0);
> +       skip_e->flags &= ~EDGE_FALLTHRU;
> +       new_e = make_edge (skip_bb, loop_preheader_edge (loop)->src, 0);
> +       if (exit->flags & EDGE_TRUE_VALUE)
> +         {
> +           skip_e->flags |= EDGE_TRUE_VALUE;
> +           new_e->flags |= EDGE_FALSE_VALUE;
> +         }
> +       else
> +         {
> +           skip_e->flags |= EDGE_FALSE_VALUE;
> +           new_e->flags |= EDGE_TRUE_VALUE;
> +         }
> +
> +       new_e->count = skip_bb->count;
> +       new_e->probability = PROB_LIKELY;
> +       new_e->count = apply_probability (skip_e->count, PROB_LIKELY);
> +       skip_e->count -= new_e->count;
> +       skip_e->probability = inverse_probability (PROB_LIKELY);
> +
> +       /* Now we have created this situation:
> +            if (!cond) {
> +              for (...) {body; if (cexit) break;}
> +              if (!cexit) goto second;
> +            } else {
> +              second:
> +              for (...) {body; if (cexit) break;}
> +            }
> +            join:
> +
> +          The second loop can now be entered by skipping the first
> +          loop (the inital values of its PHI nodes will be the
> +          original initial values), or by falling in from the first
> +          loop (the initial values will be the continuation values
> +          from the first loop).  Insert PHI nodes reflecting this
> +          in the pre-header of the second loop.  */
> +
> +       basic_block rest = loop_preheader_edge (loop)->src;
> +       edge skip_first = find_edge (cond_bb, rest);
> +       gcc_assert (skip_first);
> +
> +       edge firste = loop_preheader_edge (floop);
> +       edge seconde = loop_preheader_edge (loop);
> +       edge firstn = loop_latch_edge (floop);
> +       gphi *new_guard_phi = 0;
> +       gphi_iterator psi_first, psi_second;
> +       for (psi_first = gsi_start_phis (floop->header),
> +            psi_second = gsi_start_phis (loop->header);
> +            !gsi_end_p (psi_first);
> +            gsi_next (&psi_first), gsi_next (&psi_second))
> +         {
> +           tree init, next, new_init;
> +           use_operand_p op;
> +           gphi *phi_first = psi_first.phi ();
> +           gphi *phi_second = psi_second.phi ();
> +
> +           if (phi_second == phi)
> +             new_guard_phi = phi_first;
> +
> +           init = PHI_ARG_DEF_FROM_EDGE (phi_first, firste);
> +           next = PHI_ARG_DEF_FROM_EDGE (phi_first, firstn);
> +           op = PHI_ARG_DEF_PTR_FROM_EDGE (phi_second, seconde);
> +           gcc_assert (operand_equal_for_phi_arg_p (init, USE_FROM_PTR 
> (op)));
> +
> +           /* Prefer using original variable as a base for the new ssa name.
> +              This is necessary for virtual ops, and useful in order to avoid
> +              losing debug info for real ops.  */
> +           if (TREE_CODE (next) == SSA_NAME
> +               && useless_type_conversion_p (TREE_TYPE (next),
> +                                             TREE_TYPE (init)))
> +             new_init = copy_ssa_name (next);
> +           else if (TREE_CODE (init) == SSA_NAME
> +                    && useless_type_conversion_p (TREE_TYPE (init),
> +                                                  TREE_TYPE (next)))
> +             new_init = copy_ssa_name (init);
> +           else if (useless_type_conversion_p (TREE_TYPE (next),
> +                                               TREE_TYPE (init)))
> +             new_init = make_temp_ssa_name (TREE_TYPE (next), NULL,
> +                                            "unrinittmp");
> +           else
> +             new_init = make_temp_ssa_name (TREE_TYPE (init), NULL,
> +                                            "unrinittmp");
> +
> +           gphi * newphi = create_phi_node (new_init, rest);
> +           add_phi_arg (newphi, init, skip_first, UNKNOWN_LOCATION);
> +           add_phi_arg (newphi, next, new_e, UNKNOWN_LOCATION);
> +           SET_USE (op, new_init);
> +         }
> +
> +       /* The iterations of the second loop is now already
> +          exactly those that the first loop didn't do, but the
> +          iteration space of the first loop is still the original one.
> +          Build a new one, exactly covering those iterations where
> +          the conditional is true (or false).  For example, from such a loop:
> +
> +            for (i = beg, j = beg2; i < end; i++, j++)
> +              if (j < c)  // this is supposed to be true
> +                ...
> +
> +          we build new bounds and change the exit condtions such that
> +          it's effectively this:
> +
> +            newend = min (end+beg2-beg, c)
> +            for (i = beg; j = beg2; j < newend; i++, j++)
> +              if (j < c)
> +                ...
> +
> +          Depending on the direction of the IVs and if the exit tests
> +          are strict or include equality we need to use MIN or MAX,
> +          and add or subtract 1.  */
> +
> +       gimple_seq stmts = NULL;
> +       /* The niter structure contains the after-increment IV, we need
> +          the loop-enter base, so subtract STEP once.  */
> +       tree controlbase = force_gimple_operand (niter->control.base,
> +                                                &stmts, true, NULL_TREE);
> +       tree controlstep = niter->control.step;
> +       tree enddiff;
> +       if (POINTER_TYPE_P (TREE_TYPE (controlbase)))
> +         {
> +           controlstep = gimple_build (&stmts, NEGATE_EXPR,
> +                                       TREE_TYPE (controlstep), controlstep);
> +           enddiff = gimple_build (&stmts, POINTER_PLUS_EXPR,
> +                                   TREE_TYPE (controlbase),
> +                                   controlbase, controlstep);
> +         }
> +       else
> +         enddiff = gimple_build (&stmts, MINUS_EXPR,
> +                                 TREE_TYPE (controlbase),
> +                                 controlbase, controlstep);
> +
> +       /* Compute beg-beg2.  */
> +       if (POINTER_TYPE_P (TREE_TYPE (enddiff)))
> +         {
> +           tree tem = gimple_convert (&stmts, sizetype, guard_init);
> +           tem = gimple_build (&stmts, NEGATE_EXPR, sizetype, tem);
> +           enddiff = gimple_build (&stmts, POINTER_PLUS_EXPR,
> +                                   TREE_TYPE (enddiff),
> +                                   enddiff, tem);
> +         }
> +       else
> +         enddiff = gimple_build (&stmts, MINUS_EXPR, TREE_TYPE (enddiff),
> +                                 enddiff, guard_init);
> +
> +       /* Compute end-(beg-beg2).  */
> +       gimple_seq stmts2;
> +       tree newbound = force_gimple_operand (niter->bound, &stmts2,
> +                                             true, NULL_TREE);
> +       gimple_seq_add_seq_without_update (&stmts, stmts2);
> +
> +       if (POINTER_TYPE_P (TREE_TYPE (enddiff))
> +           || POINTER_TYPE_P (TREE_TYPE (newbound)))
> +         {
> +           enddiff = gimple_convert (&stmts, sizetype, enddiff);
> +           enddiff = gimple_build (&stmts, NEGATE_EXPR, sizetype, enddiff);
> +           newbound = gimple_build (&stmts, POINTER_PLUS_EXPR,
> +                                    TREE_TYPE (newbound),
> +                                    newbound, enddiff);
> +         }
> +       else
> +         newbound = gimple_build (&stmts, MINUS_EXPR, TREE_TYPE (enddiff),
> +                                  newbound, enddiff);
> +
> +       /* Depending on the direction of the IVs the new bound for the first
> +          loop is the minimum or maximum of old bound and border.
> +          Also, if the guard condition isn't strictly less or greater,
> +          we need to adjust the bound.  */
> +       int addbound = 0;
> +       enum tree_code minmax;
> +       if (niter->cmp == LT_EXPR)
> +         {
> +           /* GT and LE are the same, inverted.  */
> +           if (guard_code == GT_EXPR || guard_code == LE_EXPR)
> +             addbound = -1;
> +           minmax = MIN_EXPR;
> +         }
> +       else
> +         {
> +           gcc_assert (niter->cmp == GT_EXPR);
> +           if (guard_code == GE_EXPR || guard_code == LT_EXPR)
> +             addbound = 1;
> +           minmax = MAX_EXPR;
> +         }
> +
> +       if (addbound)
> +         {
> +           tree type2 = TREE_TYPE (newbound);
> +           if (POINTER_TYPE_P (type2))
> +             type2 = sizetype;
> +           newbound = gimple_build (&stmts,
> +                                    POINTER_TYPE_P (TREE_TYPE (newbound))
> +                                    ? POINTER_PLUS_EXPR : PLUS_EXPR,
> +                                    TREE_TYPE (newbound),
> +                                    newbound,
> +                                    build_int_cst (type2, addbound));
> +         }
> +
> +       tree newend = gimple_build (&stmts, minmax, TREE_TYPE (border),
> +                                   border, newbound);
> +       if (stmts)
> +         gsi_insert_seq_on_edge_immediate (loop_preheader_edge (floop),
> +                                           stmts);
> +
> +       /* Now patch the exit block of the first loop to compare
> +          the post-increment value of the guarding IV with the new end
> +          value.  */
> +       tree new_guard_next = PHI_ARG_DEF_FROM_EDGE (new_guard_phi,
> +                                                    loop_latch_edge (floop));
> +       patch_loop_exit (floop, guard_stmt, new_guard_next, newend,
> +                        initial_true);
> +
> +       /* Finally patch out the two copies of the condition to be always
> +          true/false (or opposite).  */
> +       gcond *force_true = as_a<gcond *> (last_stmt (get_bb_copy (bbs[i])));
> +       gcond *force_false = as_a<gcond *> (last_stmt (bbs[i]));
> +       if (!initial_true)
> +         std::swap (force_true, force_false);
> +       gimple_cond_make_true (force_true);
> +       gimple_cond_make_false (force_false);
> +       update_stmt (force_true);
> +       update_stmt (force_false);
> +
> +       free_original_copy_tables ();
> +
> +       /* We destroyed LCSSA form above.  Eventually we might be able
> +          to fix it on the fly, for now simply punt and use the helper.  */
> +       rewrite_into_loop_closed_ssa_1 (NULL, 0, SSA_OP_USE, floop);
> +
> +       changed = true;
> +       if (dump_file && (dump_flags & TDF_DETAILS))
> +         fprintf (dump_file, ";; Loop split.\n");
> +
> +       /* Only deal with the first opportunity.  */
> +       break;
> +      }
> +
> +  free (bbs);
> +  return changed;
> +}
> +
> +/* Main entry point.  Perform loop splitting on all suitable loops.  */
> +
> +static unsigned int
> +tree_ssa_split_loops (void)
> +{
> +  struct loop *loop;
> +  bool changed = false;
> +
> +  gcc_assert (scev_initialized_p ());
> +  /* Go through all loops starting from innermost.  */
> +  FOR_EACH_LOOP (loop, LI_FROM_INNERMOST)
> +    {
> +      struct tree_niter_desc niter;
> +      if (single_exit (loop)
> +         /* ??? We could handle non-empty latches when we split
> +            the latch edge (not the exit edge), and put the new
> +            exit condition in the new block.  OTOH this executes some
> +            code unconditionally that might have been skipped by the
> +            original exit before.  */
> +         && empty_block_p (loop->latch)
> +         && !optimize_loop_for_size_p (loop)
> +         && number_of_iterations_exit (loop, single_exit (loop), &niter,
> +                                       false, true)
> +         /* We can't yet handle loops controlled by a != predicate.  */
> +         && niter.cmp != NE_EXPR)
> +       changed |= split_loop (loop, &niter);
> +    }
> +
> +  if (changed)
> +    return TODO_cleanup_cfg;
> +  return 0;
> +}
> +
> +/* Loop splitting pass.  */
> +
> +namespace {
> +
> +const pass_data pass_data_loop_split =
> +{
> +  GIMPLE_PASS, /* type */
> +  "lsplit", /* name */
> +  OPTGROUP_LOOP, /* optinfo_flags */
> +  TV_LOOP_SPLIT, /* tv_id */
> +  PROP_cfg, /* properties_required */
> +  0, /* properties_provided */
> +  0, /* properties_destroyed */
> +  0, /* todo_flags_start */
> +  0, /* todo_flags_finish */
> +};
> +
> +class pass_loop_split : public gimple_opt_pass
> +{
> +public:
> +  pass_loop_split (gcc::context *ctxt)
> +    : gimple_opt_pass (pass_data_loop_split, ctxt)
> +  {}
> +
> +  /* opt_pass methods: */
> +  virtual bool gate (function *) { return optimize >= 2; }
> +  virtual unsigned int execute (function *);
> +
> +}; // class pass_loop_split
> +
> +unsigned int
> +pass_loop_split::execute (function *fun)
> +{
> +  if (number_of_loops (fun) <= 1)
> +    return 0;
> +
> +  return tree_ssa_split_loops ();
> +}
> +
> +} // anon namespace
> +
> +gimple_opt_pass *
> +make_pass_loop_split (gcc::context *ctxt)
> +{
> +  return new pass_loop_split (ctxt);
> +}
> Index: testsuite/gcc.dg/loop-split.c
> ===================================================================
> --- testsuite/gcc.dg/loop-split.c       (revision 0)
> +++ testsuite/gcc.dg/loop-split.c       (working copy)
> @@ -0,0 +1,141 @@
> +/* { dg-do run } */
> +/* { dg-options "-O2 -fdump-tree-lsplit-details" } */
> +
> +#ifdef __cplusplus
> +extern "C" int printf (const char *, ...);
> +extern "C" void abort (void);
> +#else
> +extern int printf (const char *, ...);
> +extern void abort (void);
> +#endif
> +
> +#ifndef TRACE
> +#define TRACE 0
> +#endif
> +
> +#define loop(beg,step,beg2,cond1,cond2) \
> +    do \
> +      { \
> +       sum = 0; \
> +        for (i = (beg), j = (beg2); (cond1); i+=(step),j+=(step)) \
> +          { \
> +            if (cond2) { \
> +             if (TRACE > 1) printf ("a: %d %d\n", i, j); \
> +              sum += a[i]; \
> +           } else { \
> +             if (TRACE > 1) printf ("b: %d %d\n", i, j); \
> +              sum += b[i]; \
> +           } \
> +          } \
> +       if (TRACE > 0) printf ("sum: %d\n", sum); \
> +       check = check * 47 + sum; \
> +      } while (0)
> +
> +#if 1
> +unsigned __attribute__((noinline, noclone)) dotest (int beg, int end, int 
> step,
> +                                              int c, int *a, int *b, int 
> beg2)
> +{
> +  unsigned check = 0;
> +  int sum;
> +  int i, j;
> +  loop (beg, 1, beg2, i < end, j < c);
> +  loop (beg, 1, beg2, i <= end, j < c);
> +  loop (beg, 1, beg2, i < end, j <= c);
> +  loop (beg, 1, beg2, i <= end, j <= c);
> +  loop (beg, 1, beg2, i < end, j > c);
> +  loop (beg, 1, beg2, i <= end, j > c);
> +  loop (beg, 1, beg2, i < end, j >= c);
> +  loop (beg, 1, beg2, i <= end, j >= c);
> +  beg2 += end-beg;
> +  loop (end, -1, beg2, i >= beg, j >= c);
> +  loop (end, -1, beg2, i >= beg, j > c);
> +  loop (end, -1, beg2, i > beg, j >= c);
> +  loop (end, -1, beg2, i > beg, j > c);
> +  loop (end, -1, beg2, i >= beg, j <= c);
> +  loop (end, -1, beg2, i >= beg, j < c);
> +  loop (end, -1, beg2, i > beg, j <= c);
> +  loop (end, -1, beg2, i > beg, j < c);
> +  return check;
> +}
> +
> +#else
> +
> +int __attribute__((noinline, noclone)) f (int beg, int end, int step,
> +                                         int c, int *a, int *b, int beg2)
> +{
> +  int sum = 0;
> +  int i, j;
> +  //for (i = beg, j = beg2; i < end; i += 1, j++ /*step*/)
> +  for (i = end, j = beg2 + (end-beg); i > beg; i += -1, j-- /*step*/)
> +    {
> +      // i - j == X --> i = X + j
> +      // --> i < end == X+j < end == j < end - X
> +      // --> newend = end - (i_init - j_init)
> +      // j < end-X && j < c --> j < min(end-X,c)
> +      // j < end-X && j <= c --> j <= min(end-X-1,c) or j < 
> min(end-X,c+1{OF!})
> +      //if (j < c)
> +      if (j >= c)
> +       printf ("a: %d %d\n", i, j);
> +      /*else
> +       printf ("b: %d %d\n", i, j);*/
> +       /*sum += a[i];
> +      else
> +       sum += b[i];*/
> +    }
> +  return sum;
> +}
> +
> +int __attribute__((noinline, noclone)) f2 (int *beg, int *end, int step,
> +                                         int *c, int *a, int *b, int *beg2)
> +{
> +  int sum = 0;
> +  int *i, *j;
> +  for (i = beg, j = beg2; i < end; i += 1, j++ /*step*/)
> +    {
> +      if (j <= c)
> +       printf ("%d %d\n", i - beg, j - beg);
> +       /*sum += a[i];
> +      else
> +       sum += b[i];*/
> +    }
> +  return sum;
> +}
> +#endif
> +
> +extern int printf (const char *, ...);
> +
> +int main ()
> +{
> +  int a[] = {0,0,0,0,0, 1,2,3,4,5,6,7,8,9,          0,0,0,0,0};
> +  int b[] = {0,0,0,0,0, -1,-2,-3,-4,-5,-6,-7,-8,-9, 0,0,0,0,0,};
> +  int c;
> +  int diff = 0;
> +  unsigned check = 0;
> +  //dotest (0, 9, 1, -1, a+5, b+5, -1);
> +  //return 0;
> +  //f (0, 9, 1, -1, a+5, b+5, -1);
> +  //return 0;
> +  for (diff = -5; diff <= 5; diff++)
> +    {
> +      for (c = -1; c <= 10; c++)
> +       {
> +#if 0
> +         int s = f (0, 9, 1, c, a+5, b+5, diff);
> +         //int s = f2 (a+0, a+9, 1, a+c, a+5, b+5, a+diff);
> +         printf ("%d ", s);
> +#else
> +         if (TRACE > 0)
> +           printf ("check %d %d\n", c, diff);
> +         check = check * 51 + dotest (0, 9, 1, c, a+5, b+5, diff);
> +#endif
> +       }
> +      //printf ("\n");
> +    }
> +  //printf ("%u\n", check);
> +  if (check != 3213344948)
> +    abort ();
> +  return 0;
> +}
> +
> +/* All 16 loops in dotest should be split.  */
> +/* { dg-final { scan-tree-dump-times "Loop split" 16 "lsplit" } } */

Re: Gimple loop splitting

Reply via email to