> -----Original Message-----
> From: Richard Biener <rguent...@suse.de>
> Sent: Tuesday, November 7, 2023 3:04 PM
> To: Tamar Christina <tamar.christ...@arm.com>
> Cc: gcc-patches@gcc.gnu.org; nd <n...@arm.com>; j...@ventanamicro.com
> Subject: Re: [PATCH 5/21]middle-end: update vectorizer's control update to
> support picking an exit other than loop latch
> 
> On Mon, 6 Nov 2023, Tamar Christina wrote:
> 
> > Hi All,
> >
> > As requested, the vectorizer is now free to pick it's own exit which
> > can be different than what the loop CFG infrastucture uses.  The
> > vectorizer makes use of this to vectorize loops that it previously could 
> > not.
> >
> > But this means that loop control must be materialized in the block
> > that needs it less we corrupt the SSA chain.  This makes it so we use
> > the vectorizer's main IV block instead of the loop infra.
> >
> > Bootstrapped Regtested on aarch64-none-linux-gnu and no issues.
> >
> > Ok for master?
> >
> > Thanks,
> > Tamar
> >
> > gcc/ChangeLog:
> >
> >     * tree-ssa-loop-manip.cc (standard_iv_increment_position):
> Conditionally
> >     take dest BB.
> >     * tree-ssa-loop-manip.h (standard_iv_increment_position): Likewise.
> >     * tree-vect-loop-manip.cc (vect_set_loop_controls_directly): Use it.
> >     (vect_set_loop_condition_partial_vectors_avx512): Likewise.
> >     (vect_set_loop_condition_normal): Likewise.
> >
> > --- inline copy of patch --
> > diff --git a/gcc/tree-ssa-loop-manip.h b/gcc/tree-ssa-loop-manip.h
> > index
> >
> bda09f51d5619420331c513a9906831c779fd2b4..5938588c8882d842b00
> 301423df1
> > 11cbe7bf7ba8 100644
> > --- a/gcc/tree-ssa-loop-manip.h
> > +++ b/gcc/tree-ssa-loop-manip.h
> > @@ -38,7 +38,8 @@ extern basic_block split_loop_exit_edge (edge, bool
> > = false);  extern basic_block ip_end_pos (class loop *);  extern
> > basic_block ip_normal_pos (class loop *);  extern void
> > standard_iv_increment_position (class loop *,
> > -                                       gimple_stmt_iterator *, bool *);
> > +                                       gimple_stmt_iterator *, bool *,
> > +                                       basic_block = NULL);
> >  extern bool
> >  gimple_duplicate_loop_body_to_header_edge (class loop *, edge, unsigned
> int,
> >                                        sbitmap, edge, vec<edge> *, int); 
> > diff
> --git
> > a/gcc/tree-ssa-loop-manip.cc b/gcc/tree-ssa-loop-manip.cc index
> >
> e7436915e01297e7af2a3bcf1afd01e014de6f32..bdc7a3d74a788f450ca5d
> de6c294
> > 92ce4d4e4550 100644
> > --- a/gcc/tree-ssa-loop-manip.cc
> > +++ b/gcc/tree-ssa-loop-manip.cc
> > @@ -792,14 +792,19 @@ ip_normal_pos (class loop *loop)
> >
> >  /* Stores the standard position for induction variable increment in LOOP
> >     (just before the exit condition if it is available and latch block is 
> > empty,
> > -   end of the latch block otherwise) to BSI.  INSERT_AFTER is set to true 
> > if
> > -   the increment should be inserted after *BSI.  */
> > +   end of the latch block otherwise) to BSI.  If DEST_BB is specified then 
> > that
> > +   basic block is used as the destination instead of the loop latch source
> > +   block.  INSERT_AFTER is set to true if the increment should be inserted
> after
> > +   *BSI.  */
> >
> >  void
> >  standard_iv_increment_position (class loop *loop, gimple_stmt_iterator
> *bsi,
> > -                           bool *insert_after)
> > +                           bool *insert_after, basic_block dest_bb)
> >  {
> > -  basic_block bb = ip_normal_pos (loop), latch = ip_end_pos (loop);
> > +  basic_block bb = dest_bb;
> > +  if (!bb)
> > +    bb = ip_normal_pos (loop);
> > +  basic_block latch = ip_end_pos (loop);
> 
> I don't think that's a good API extension.  Given that we don't support an 
> early
> exit after the main IV exit doesn't this code already work fine as-is?  It 
> chooses
> the last exit.  The position is also not semantically relevant, we just try 
> to keep
> the latch empty here (that is, it's a bit of a "bad" API).
> 
> So, do you really need this change?

Yes I do, If you look at these kinds of loops 
https://gist.github.com/Mistuke/66f14fe5c1be32b91ce149bd9b8bb35f

You'll see that the main exit, i.e. the one attached to the latch block is the 
early break.
Because SCEV can't analyze it picks the main exit to be the one in BB4.

This means that the loop control must be placed in BB4.  If we place ivtmp_10 = 
ivtmp_9 - 1
In BB 3 then that's broken SSA.  If we use `ivtmp_9` in BB4 then we'll have an 
off by one issue.

You could have reached the end of the valid range for the loop when you 
re-enter BB4, since
loads are still allowed you'll then read out of bounds before checking that you 
exit.

This is also annoyingly hard to get correct, which Is what took me a long time. 
 Such loops mean
You need to restart the scalar loop at i_7 if you take the main exit.

Regards,
Tamar

> 
> Maybe we're really using standard_iv_increment_position wrong here, the
> result is supposed to _only_ feed the PHI latch argument.
> Richard.
> 
> >    gimple *last = last_nondebug_stmt (latch);
> >
> >    if (!bb
> > diff --git a/gcc/tree-vect-loop-manip.cc b/gcc/tree-vect-loop-manip.cc
> > index
> >
> 6fbb5b80986fd657814b48eb009b52b094f331e6..3d59119787d6afdc5a64
> 65a547d1
> > ea2d3d940373 100644
> > --- a/gcc/tree-vect-loop-manip.cc
> > +++ b/gcc/tree-vect-loop-manip.cc
> > @@ -531,7 +531,8 @@ vect_set_loop_controls_directly (class loop *loop,
> loop_vec_info loop_vinfo,
> >    tree index_before_incr, index_after_incr;
> >    gimple_stmt_iterator incr_gsi;
> >    bool insert_after;
> > -  standard_iv_increment_position (loop, &incr_gsi, &insert_after);
> > +  edge exit_e = LOOP_VINFO_IV_EXIT (loop_vinfo);
> > + standard_iv_increment_position (loop, &incr_gsi, &insert_after,
> > + exit_e->src);
> >    if (LOOP_VINFO_USING_DECREMENTING_IV_P (loop_vinfo))
> >      {
> >        /* Create an IV that counts down from niters_total and whose
> > step @@ -1017,7 +1018,8 @@
> vect_set_loop_condition_partial_vectors_avx512 (class loop *loop,
> >    tree index_before_incr, index_after_incr;
> >    gimple_stmt_iterator incr_gsi;
> >    bool insert_after;
> > -  standard_iv_increment_position (loop, &incr_gsi, &insert_after);
> > +  standard_iv_increment_position (loop, &incr_gsi, &insert_after,
> > +                             exit_edge->src);
> >    create_iv (niters_adj, MINUS_EXPR, iv_step, NULL_TREE, loop,
> >          &incr_gsi, insert_after, &index_before_incr,
> >          &index_after_incr);
> > @@ -1185,7 +1187,7 @@ vect_set_loop_condition_partial_vectors_avx512
> (class loop *loop,
> >     loop handles exactly VF scalars per iteration.  */
> >
> >  static gcond *
> > -vect_set_loop_condition_normal (loop_vec_info /* loop_vinfo */, edge
> > exit_edge,
> > +vect_set_loop_condition_normal (loop_vec_info loop_vinfo, edge
> > +exit_edge,
> >                             class loop *loop, tree niters, tree step,
> >                             tree final_iv, bool niters_maybe_zero,
> >                             gimple_stmt_iterator loop_cond_gsi) @@ -
> 1278,7 +1280,8 @@
> > vect_set_loop_condition_normal (loop_vec_info /* loop_vinfo */, edge
> exit_edge,
> >     }
> >      }
> >
> > -  standard_iv_increment_position (loop, &incr_gsi, &insert_after);
> > +  standard_iv_increment_position (loop, &incr_gsi, &insert_after,
> > +                             exit_edge->src);
> >    create_iv (init, PLUS_EXPR, step, NULL_TREE, loop,
> >               &incr_gsi, insert_after, &indx_before_incr, &indx_after_incr);
> >    indx_after_incr = force_gimple_operand_gsi (&loop_cond_gsi,
> > indx_after_incr,
> >
> >
> >
> >
> >
> 
> --
> Richard Biener <rguent...@suse.de>
> SUSE Software Solutions Germany GmbH,
> Frankenstrasse 146, 90461 Nuernberg, Germany;
> GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG
> Nuernberg)

Reply via email to