On Mon, 27 Oct 2025, Tamar Christina wrote:
> > -----Original Message-----
> > From: Richard Biener <[email protected]>
> > Sent: 27 October 2025 14:27
> > To: Tamar Christina <[email protected]>
> > Cc: [email protected]; nd <[email protected]>; [email protected]
> > Subject: RE: [PATCH][vect]: Fix LCSSA wiring during peeling of multiple-exit
> > loops
> >
> > On Mon, 27 Oct 2025, Tamar Christina wrote:
> >
> > > > -----Original Message-----
> > > > From: Richard Biener <[email protected]>
> > > > Sent: 27 October 2025 13:37
> > > > To: Tamar Christina <[email protected]>
> > > > Cc: [email protected]; nd <[email protected]>; [email protected]
> > > > Subject: Re: [PATCH][vect]: Fix LCSSA wiring during peeling of
> > > > multiple-exit
> > > > loops
> > > >
> > > > On Sat, 25 Oct 2025, Tamar Christina wrote:
> > > >
> > > > > Arg..
> > > > >
> > > > > Just noticed a typo..
> > > > >
> > > > > BB 3:
> > > > > i_2 = PHI <i_1(2)>
> > > > >
> > > > > Should be
> > > > >
> > > > >
> > > > > BB 3:
> > > > > i_2 = PHI <i_0(2)>
> > > > >
> > > > > Rest is OK. Sorry about that
> > > > >
> > > > > ________________________________
> > > > > From: Tamar Christina
> > > > > Sent: Saturday, October 25, 2025 10:52 AM
> > > > > To: [email protected] <[email protected]>
> > > > > Cc: nd <[email protected]>; [email protected] <[email protected]>; [email protected] <[email protected]>
> > > > > Subject: [PATCH][vect]: Fix LCSSA wiring during peeling of multiple-exit loops
> > > > >
> > > > > Consider this Fortran testcase:
> > > > >
> > > > > integer (8) b, c
> > > > > integer d
> > > > > c = 10
> > > > > d = 2
> > > > > call e ((/ (b, b = j, c, d) /), 0_8, c, d + 0_8)
> > > > > contains
> > > > > subroutine e (a, f, g, h)
> > > > > integer (8), dimension (:) :: a
> > > > > integer (8) f, g, h
> > > > > i = 1
> > > > > do b = f, g, h
> > > > > if (a (i) .ne. b) STOP
> > > > > i = i + 1
> > > > > end do
> > > > > if (size (a) .ne. i - 1) STOP 2
> > > > > end
> > > > > end
> > > > >
> > > > > which is a reduced example from the testsuite, so it is not added as a
> > > > > new test.
> > > > >
> > > > > This testcase starts the loop at i = 1, but no peeling needs to be
> > > > > done.
> > > > > As a result the loop form looks like
> > > > >
> > > > > BB 1:
> > > > > i_0 = PHI <1(initial), i_1(latch)>;
> > > > >
> > > > > if (<early-break-condition>)
> > > > > ...
> > > > > else
> > > > > goto BB 2;
> > > > >
> > > > > BB 2:
> > > > > i_1 = i_0 + 1;
> > > > > if (<main-exit>)
> > > > > goto BB 3;
> > > > > goto BB 1;
> > > > >
> > > > > BB 3:
> > > > > i_2 = PHI <i_1(2)>;
> > > > >
> > > > > Basically in the main IV exit we don't use the value *after*
> > > > > incrementing but rather the value before incrementing.
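> > > > >
> > > > > In C terms the same shape is roughly (a sketch; the condition names
> > > > > are made up):
> > > > >
> > > > >   int i = 1;
> > > > >   for (;;)
> > > > >     {
> > > > >       if (early_break_cond (i))   /* BB 1: early exit.  */
> > > > >         break;
> > > > >       int next = i + 1;           /* BB 2: i_1 = i_0 + 1.  */
> > > > >       if (main_exit_cond (next))  /* Main exit: the live value is
> > > > >                                      i (i_0), not next (i_1).  */
> > > > >         break;
> > > > >       i = next;
> > > > >     }
> > > > >   use (i);  /* On the main exit this is the pre-increment value.  */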
> > > > >
> > > > > At the moment, during peeling, we wipe away this information and
> > > > > replace the value in the exit with i_1.  This is incorrect and means
> > > > > we can no longer tell which value of the IV we should actually be
> > > > > calculating for that exit.
> > > > >
> > > > > This isn't triggered often on trunk because an artificial inflation
> > > > > of the VF causes the main exit not to be reached very often, and such
> > > > > loops aren't common.  However, with the scalar IV code this issue was
> > > > > brought to light.
> > > > >
> > > > > This patch preserves the value of the LC PHI set during edge
> > > > > redirection.  We still end up creating the new PHI, but it now uses
> > > > > the correct value.  The reason for still creating the duplicate PHI
> > > > > is that there is code outside of peeling that today depends on the
> > > > > PHI node ordering, so this is a conservative fix.  The duplicate PHI
> > > > > gets elided later as it will be unused.
> > > > >
> > > > > Bootstrapped and regtested on aarch64-none-linux-gnu,
> > > > > arm-none-linux-gnueabihf, and x86_64-pc-linux-gnu (-m32 and -m64)
> > > > > with no issues.
> > > > >
> > > > > Ok for master?
> > > > >
> > > > > Thanks,
> > > > > Tamar
> > > > >
> > > > > gcc/ChangeLog:
> > > > >
> > > > > 	* tree-vect-loop-manip.cc (slpeel_tree_duplicate_loop_to_edge_cfg):
> > > > > 	Fix up PHIs in exit nodes.
> > > > >
> > > > > ---
> > > > > diff --git a/gcc/tree-vect-loop-manip.cc b/gcc/tree-vect-loop-manip.cc
> > > > > index 20141dbc2e54d49bca00a2e75a6733d181e05ed1..96ca273c24680556f16cdc9e465f490d7fcdb8a4 100644
> > > > > --- a/gcc/tree-vect-loop-manip.cc
> > > > > +++ b/gcc/tree-vect-loop-manip.cc
> > > > > @@ -1755,6 +1755,22 @@ slpeel_tree_duplicate_loop_to_edge_cfg (class loop *loop, edge loop_exit,
> > > > >  	    new_arg = PHI_ARG_DEF_FROM_EDGE (from_phi,
> > > > >  					     loop_latch_edge (loop));
> > > > >
> > > > > +	      /* Some loops start at a non-zero index, which then requires
> > > > > +		 the redirection to return i instead of i + 1.  Edge
> > > > > +		 redirection already deals with this, but we need to make
> > > > > +		 sure we don't override the value it puts in for these
> > > > > +		 cases.  */
> > > > > +	      imm_use_iterator imm_iter;
> > > > > +	      use_operand_p use_p;
> > > > > +	      tree from_def = PHI_RESULT (from_phi);
> > > > > +	      bool abnormal_exit_condition = false;
> > > > > +	      FOR_EACH_IMM_USE_FAST (use_p, imm_iter, from_def)
> > > > > +		if (gimple_bb (USE_STMT (use_p)) == main_loop_exit_block)
> > > > > +		  {
> > > > > +		    abnormal_exit_condition = true;
> > > > > +		    break;
> > > > > +		  }
> > > > > +
> > > >
> > > > I'm a bit confused - isn't this exactly what 'peeled_iters'
> > > > (aka LOOP_VINFO_EARLY_BREAKS_VECT_PEELED) is about? That is,
> > > > this loop is supposed to handle the main exit (and creating
> > > > all required merge PHIs and eventually missing LC PHIs), so
> > > > your patch suggests we put in the "wrong" result, but you
> > > > leave the old 'peeled_iters' code around, even using the
> > > > original 'caching' (but then for the wrong value?).
> > > >
> > >
> > > No, peeled_iters means we want the value at the start of the vector
> > > iteration.  The issue here is that we want the value at the end of the
> > > vector iteration - <offset>.
> > >
> > > So they aren't the same.  In a normal loop this adjustment is calculated
> > > by vect_update_ivs_after_vectorizer through niters; because it has the
> > > right PHI node, it doesn't use the incorrect final value of i to adjust
> > > the count.
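> > >
> > > Roughly, in vector-loop terms (a sketch; VF stands for the
> > > vectorization factor):
> > >
> > >   i_0 = PHI <start, i_1>  /* start of the vector iteration: the
> > >                              peeled_iters case.  */
> > >   i_1 = i_0 + VF          /* end of the vector iteration.  */
> > >   /* The main exit here instead wants i_1 - 1, which for VF > 1 is
> > >      neither i_0 nor i_1.  */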
> > >
> > > > At the same time the 2nd loop filling the alternate exit cases
> > > > is always correct?
> > >
> > > Yes, because alternate exits always restart the loop from the start, so
> > > the reduction for it takes you back to before you do any adjustment at
> > > all for i.
> >
> > So I believe the logic
> >
> > 		/* When the vector loop is peeled then we need to use the
> > 		   value at start of the loop, otherwise the main loop exit
> > 		   should use the final iter value.  */
> > 		tree new_arg;
> > 		if (peeled_iters)
> > 		  new_arg = gimple_phi_result (from_phi);
> > 		else
> > 		  new_arg = PHI_ARG_DEF_FROM_EDGE (from_phi,
> > 						   loop_latch_edge (loop));
> >
> > for !peeled_iters is correct. In this place we are looking for the
> > initial value to use for the epilog loop - but in this case this
> > value is different from the actual live value computed by the testcase.
> > Note there is in general no way to associate the LC PHI [def] with
> > the IV in the loop header if it is neither the PHI result nor the
> > latch value - as it is in this case.  Instead the LC PHI is for an
> > altogether different "IV" (but a related, offsetted one).
> >
> > And I see '5' being used as the continuation value via
> > vect_update_ivs_after_vectorizer, so that seems correct.
> >
> > So, I can't really see what is wrong here? The vectorized code
> > also looks OK. It should be possible to set up a C testcase for
> > this, like
> >
>
> As discussed on IRC, I'll move the fix somewhere else then, but it might
> end up being more disruptive.
I'd like to have a look myself, but for this I have to be able to
reproduce an actual miscompile. So can you please give instructions
on how to arrive at one? What did you reduce the testcase from?
What patches do I need to apply to reproduce, on which target, with
what options?
Thanks,
Richard.
> Thanks,
> Tamar
>
> > int i = 1, j;
> > do
> > {
> > if (...) abort ();
> > j = i + 1;
> > }
> > while (...);
> > ... use i ...
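> >
> > A complete, compilable variant of that sketch might look like the
> > following (the array check, the bound n and the noipa attribute are
> > made up here, just to pin the values down):
> >
> >   #include <stdlib.h>
> >
> >   int __attribute__((noipa))
> >   f (long *a, int n)
> >   {
> >     int i = 1, j;
> >     do
> >       {
> >         if (a[i - 1] != i)           /* early exit */
> >           abort ();
> >         j = i + 1;                   /* i_1 = i_0 + 1 */
> >       }
> >     while (j <= n && (i = j, 1));    /* main exit keeps pre-increment i */
> >     return i;                        /* ... use i ... */
> >   }
> >
> > so the value live after the loop on the main exit is the one from
> > before the last increment.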
> >
> > Richard.
> >
> > > >
> > > > Does the issue show with the fortran testcase on current trunk?
> > > >
> > >
> > > Yes, the codegen on trunk today is wrong, but the main exit would never
> > > be reached because of the increased VF.  So the test "passes" because it
> > > takes the early exit.
> > >
> > > Thanks,
> > > Tamar
> > >
> > > > Thanks,
> > > > Richard.
> > > >
> > > > >  		  /* Check if we've already created a new phi node during
> > > > >  		     edge redirection and re-use it if so.  Otherwise
> > > > >  		     create a LC PHI node to feed the merge PHI.  */
> > > > > @@ -1769,6 +1785,11 @@ slpeel_tree_duplicate_loop_to_edge_cfg (class loop *loop, edge loop_exit,
> > > > > new_arg = *res;
> > > > > else
> > > > > {
> > > > > +		  /* If we already found an LC PHI for the current PHI
> > > > > +		     then just re-use it.  */
> > > > > + if (abnormal_exit_condition)
> > > > > + new_arg = from_def;
> > > > > +
> > > > > /* Create the LC PHI node for the exit. */
> > > > > tree new_def = copy_ssa_name (new_arg);
> > > > > gphi *lc_phi
> > > > > @@ -1779,7 +1800,7 @@ slpeel_tree_duplicate_loop_to_edge_cfg
> > (class
> > > > loop *loop, edge loop_exit,
> > > > >
> > > > >  		  /* Create the PHI node in the merge block merging the
> > > > >  		     main and early exit values.  */
> > > > > -		  tree new_res = copy_ssa_name (gimple_phi_result (from_phi));
> > > > > +		  tree new_res = copy_ssa_name (from_def);
> > > > >  		  gphi *lcssa_phi = create_phi_node (new_res, new_preheader);
> > > > >  		  edge main_e = single_succ_edge (main_loop_exit_block);
> > > > >  		  SET_PHI_ARG_DEF_ON_EDGE (lcssa_phi, main_e, new_arg);
> > > > >
> > > > >
> > > > > --
> > > > >
> > > >
> > > > --
> > > > Richard Biener <[email protected]>
> > > > SUSE Software Solutions Germany GmbH,
> > > > Frankenstrasse 146, 90461 Nuernberg, Germany;
> > > > GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG Nuernberg)
> > >
> >
> > --
> > Richard Biener <[email protected]>
> > SUSE Software Solutions Germany GmbH,
> > Frankenstrasse 146, 90461 Nuernberg, Germany;
> > GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG Nuernberg)
>
--
Richard Biener <[email protected]>
SUSE Software Solutions Germany GmbH,
Frankenstrasse 146, 90461 Nuernberg, Germany;
GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG Nuernberg)