On Mon, 13 Nov 2023, Tamar Christina wrote:

> > -----Original Message-----
> > From: Richard Biener <rguent...@suse.de>
> > Sent: Tuesday, November 7, 2023 3:04 PM
> > To: Tamar Christina <tamar.christ...@arm.com>
> > Cc: gcc-patches@gcc.gnu.org; nd <n...@arm.com>; j...@ventanamicro.com
> > Subject: Re: [PATCH 5/21]middle-end: update vectorizer's control update to
> > support picking an exit other than loop latch
> > 
> > On Mon, 6 Nov 2023, Tamar Christina wrote:
> > 
> > > Hi All,
> > >
> > > As requested, the vectorizer is now free to pick it's own exit which
> > > can be different than what the loop CFG infrastucture uses.  The
> > > vectorizer makes use of this to vectorize loops that it previously could 
> > > not.
> > >
> > > But this means that loop control must be materialized in the block
> > > that needs it less we corrupt the SSA chain.  This makes it so we use
> > > the vectorizer's main IV block instead of the loop infra.
> > >
> > > Bootstrapped Regtested on aarch64-none-linux-gnu and no issues.
> > >
> > > Ok for master?
> > >
> > > Thanks,
> > > Tamar
> > >
> > > gcc/ChangeLog:
> > >
> > >   * tree-ssa-loop-manip.cc (standard_iv_increment_position):
> > Conditionally
> > >   take dest BB.
> > >   * tree-ssa-loop-manip.h (standard_iv_increment_position): Likewise.
> > >   * tree-vect-loop-manip.cc (vect_set_loop_controls_directly): Use it.
> > >   (vect_set_loop_condition_partial_vectors_avx512): Likewise.
> > >   (vect_set_loop_condition_normal): Likewise.
> > >
> > > --- inline copy of patch --
> > > diff --git a/gcc/tree-ssa-loop-manip.h b/gcc/tree-ssa-loop-manip.h
> > > index
> > >
> > bda09f51d5619420331c513a9906831c779fd2b4..5938588c8882d842b00
> > 301423df1
> > > 11cbe7bf7ba8 100644
> > > --- a/gcc/tree-ssa-loop-manip.h
> > > +++ b/gcc/tree-ssa-loop-manip.h
> > > @@ -38,7 +38,8 @@ extern basic_block split_loop_exit_edge (edge, bool
> > > = false);  extern basic_block ip_end_pos (class loop *);  extern
> > > basic_block ip_normal_pos (class loop *);  extern void
> > > standard_iv_increment_position (class loop *,
> > > -                                     gimple_stmt_iterator *, bool *);
> > > +                                     gimple_stmt_iterator *, bool *,
> > > +                                     basic_block = NULL);
> > >  extern bool
> > >  gimple_duplicate_loop_body_to_header_edge (class loop *, edge, unsigned
> > int,
> > >                                      sbitmap, edge, vec<edge> *, int); 
> > > diff
> > --git
> > > a/gcc/tree-ssa-loop-manip.cc b/gcc/tree-ssa-loop-manip.cc index
> > >
> > e7436915e01297e7af2a3bcf1afd01e014de6f32..bdc7a3d74a788f450ca5d
> > de6c294
> > > 92ce4d4e4550 100644
> > > --- a/gcc/tree-ssa-loop-manip.cc
> > > +++ b/gcc/tree-ssa-loop-manip.cc
> > > @@ -792,14 +792,19 @@ ip_normal_pos (class loop *loop)
> > >
> > >  /* Stores the standard position for induction variable increment in LOOP
> > >     (just before the exit condition if it is available and latch block is 
> > > empty,
> > > -   end of the latch block otherwise) to BSI.  INSERT_AFTER is set to 
> > > true if
> > > -   the increment should be inserted after *BSI.  */
> > > +   end of the latch block otherwise) to BSI.  If DEST_BB is specified 
> > > then that
> > > +   basic block is used as the destination instead of the loop latch 
> > > source
> > > +   block.  INSERT_AFTER is set to true if the increment should be 
> > > inserted
> > after
> > > +   *BSI.  */
> > >
> > >  void
> > >  standard_iv_increment_position (class loop *loop, gimple_stmt_iterator
> > *bsi,
> > > -                         bool *insert_after)
> > > +                         bool *insert_after, basic_block dest_bb)
> > >  {
> > > -  basic_block bb = ip_normal_pos (loop), latch = ip_end_pos (loop);
> > > +  basic_block bb = dest_bb;
> > > +  if (!bb)
> > > +    bb = ip_normal_pos (loop);
> > > +  basic_block latch = ip_end_pos (loop);
> > 
> > I don't think that's a good API extension.  Given that we don't support an 
> > early
> > exit after the main IV exit doesn't this code already work fine as-is?  It 
> > chooses
> > the last exit.  The position is also not semantically relevant, we just try 
> > to keep
> > the latch empty here (that is, it's a bit of a "bad" API).
> > 
> > So, do you really need this change?
> 
> Yes I do, If you look at these kinds of loops 
> https://gist.github.com/Mistuke/66f14fe5c1be32b91ce149bd9b8bb35f
> 
> You'll see that the main exit, i.e. the one attached to the latch block is 
> the early break.
> Because SCEV can't analyze it picks the main exit to be the one in BB4.
> 
> This means that the loop control must be placed in BB4.  If we place ivtmp_10 
> = ivtmp_9 - 1
> In BB 3 then that's broken SSA.  If we use `ivtmp_9` in BB4 then we'll have 
> an off by one issue.

OK, but then I think the fix is to not use standard_iv_increment_position
(it's a weird API anyway).  Instead insert before the main exit condition.

Btw, I assumed this order of main / early exit cannot happen.  But I
didn't re-review the main exit identification code yet.

Richard.

> You could have reached the end of the valid range for the loop when you 
> re-enter BB4, since
> loads are still allowed you'll then read out of bounds before checking that 
> you exit.
> 
> This is also annoyingly hard to get correct, which Is what took me a long 
> time.  Such loops mean
> You need to restart the scalar loop at i_7 if you take the main exit.
> 
> Regards,
> Tamar
> 
> > 
> > Maybe we're really using standard_iv_increment_position wrong here, the
> > result is supposed to _only_ feed the PHI latch argument.
> > Richard.
> > 
> > >    gimple *last = last_nondebug_stmt (latch);
> > >
> > >    if (!bb
> > > diff --git a/gcc/tree-vect-loop-manip.cc b/gcc/tree-vect-loop-manip.cc
> > > index
> > >
> > 6fbb5b80986fd657814b48eb009b52b094f331e6..3d59119787d6afdc5a64
> > 65a547d1
> > > ea2d3d940373 100644
> > > --- a/gcc/tree-vect-loop-manip.cc
> > > +++ b/gcc/tree-vect-loop-manip.cc
> > > @@ -531,7 +531,8 @@ vect_set_loop_controls_directly (class loop *loop,
> > loop_vec_info loop_vinfo,
> > >    tree index_before_incr, index_after_incr;
> > >    gimple_stmt_iterator incr_gsi;
> > >    bool insert_after;
> > > -  standard_iv_increment_position (loop, &incr_gsi, &insert_after);
> > > +  edge exit_e = LOOP_VINFO_IV_EXIT (loop_vinfo);
> > > + standard_iv_increment_position (loop, &incr_gsi, &insert_after,
> > > + exit_e->src);
> > >    if (LOOP_VINFO_USING_DECREMENTING_IV_P (loop_vinfo))
> > >      {
> > >        /* Create an IV that counts down from niters_total and whose
> > > step @@ -1017,7 +1018,8 @@
> > vect_set_loop_condition_partial_vectors_avx512 (class loop *loop,
> > >    tree index_before_incr, index_after_incr;
> > >    gimple_stmt_iterator incr_gsi;
> > >    bool insert_after;
> > > -  standard_iv_increment_position (loop, &incr_gsi, &insert_after);
> > > +  standard_iv_increment_position (loop, &incr_gsi, &insert_after,
> > > +                           exit_edge->src);
> > >    create_iv (niters_adj, MINUS_EXPR, iv_step, NULL_TREE, loop,
> > >        &incr_gsi, insert_after, &index_before_incr,
> > >        &index_after_incr);
> > > @@ -1185,7 +1187,7 @@ vect_set_loop_condition_partial_vectors_avx512
> > (class loop *loop,
> > >     loop handles exactly VF scalars per iteration.  */
> > >
> > >  static gcond *
> > > -vect_set_loop_condition_normal (loop_vec_info /* loop_vinfo */, edge
> > > exit_edge,
> > > +vect_set_loop_condition_normal (loop_vec_info loop_vinfo, edge
> > > +exit_edge,
> > >                           class loop *loop, tree niters, tree step,
> > >                           tree final_iv, bool niters_maybe_zero,
> > >                           gimple_stmt_iterator loop_cond_gsi) @@ -
> > 1278,7 +1280,8 @@
> > > vect_set_loop_condition_normal (loop_vec_info /* loop_vinfo */, edge
> > exit_edge,
> > >   }
> > >      }
> > >
> > > -  standard_iv_increment_position (loop, &incr_gsi, &insert_after);
> > > +  standard_iv_increment_position (loop, &incr_gsi, &insert_after,
> > > +                           exit_edge->src);
> > >    create_iv (init, PLUS_EXPR, step, NULL_TREE, loop,
> > >               &incr_gsi, insert_after, &indx_before_incr, 
> > > &indx_after_incr);
> > >    indx_after_incr = force_gimple_operand_gsi (&loop_cond_gsi,
> > > indx_after_incr,
> > >
> > >
> > >
> > >
> > >
> > 
> > --
> > Richard Biener <rguent...@suse.de>
> > SUSE Software Solutions Germany GmbH,
> > Frankenstrasse 146, 90461 Nuernberg, Germany;
> > GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG
> > Nuernberg)
> 

-- 
Richard Biener <rguent...@suse.de>
SUSE Software Solutions Germany GmbH,
Frankenstrasse 146, 90461 Nuernberg, Germany;
GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG Nuernberg)

Reply via email to