> -----Original Message-----
> From: Richard Biener <[email protected]>
> Sent: 28 October 2025 08:00
> To: Tamar Christina <[email protected]>
> Cc: [email protected]; nd <[email protected]>; [email protected]
> Subject: RE: [PATCH][vect]: Fix LCSSA wiring during peeling of multiple-exit loops
>
> On Mon, 27 Oct 2025, Tamar Christina wrote:
>
> > > -----Original Message-----
> > > From: Richard Biener <[email protected]>
> > > Sent: 27 October 2025 14:27
> > > To: Tamar Christina <[email protected]>
> > > Cc: [email protected]; nd <[email protected]>; [email protected]
> > > Subject: RE: [PATCH][vect]: Fix LCSSA wiring during peeling of multiple-exit loops
> > >
> > > On Mon, 27 Oct 2025, Tamar Christina wrote:
> > >
> > > > > -----Original Message-----
> > > > > From: Richard Biener <[email protected]>
> > > > > Sent: 27 October 2025 13:37
> > > > > To: Tamar Christina <[email protected]>
> > > > > Cc: [email protected]; nd <[email protected]>; [email protected]
> > > > > Subject: Re: [PATCH][vect]: Fix LCSSA wiring during peeling of multiple-exit loops
> > > > >
> > > > > On Sat, 25 Oct 2025, Tamar Christina wrote:
> > > > >
> > > > > > Arg..
> > > > > >
> > > > > > Just noticed a typo..
> > > > > >
> > > > > > BB 3:
> > > > > >   i_2 = PHI <i_1(2)>
> > > > > >
> > > > > > Should be
> > > > > >
> > > > > > BB 3:
> > > > > >   i_2 = PHI <i_0(2)>
> > > > > >
> > > > > > Rest is OK.  Sorry about that
> > > > > >
> > > > > > ________________________________
> > > > > > From: Tamar Christina
> > > > > > Sent: Saturday, October 25, 2025 10:52 AM
> > > > > > To: [email protected] <[email protected]>
> > > > > > Cc: nd <[email protected]>; [email protected] <[email protected]>;
> > > > > > [email protected] <[email protected]>
> > > > > > Subject: [PATCH][vect]: Fix LCSSA wiring during peeling of multiple-exit loops
> > > > > >
> > > > > > Consider this Fortran testcase:
> > > > > >
> > > > > >   integer (8) b, c
> > > > > >   integer d
> > > > > >   c = 10
> > > > > >   d = 2
> > > > > >   call e ((/ (b, b = j, c, d) /), 0_8, c, d + 0_8)
> > > > > > contains
> > > > > >   subroutine e (a, f, g, h)
> > > > > >     integer (8), dimension (:) :: a
> > > > > >     integer (8) f, g, h
> > > > > >     i = 1
> > > > > >     do b = f, g, h
> > > > > >       if (a (i) .ne. b) STOP
> > > > > >       i = i + 1
> > > > > >     end do
> > > > > >     if (size (a) .ne. i - 1) STOP 2
> > > > > >   end
> > > > > > end
> > > > > >
> > > > > > which is a reduced example from the testsuite, so it is not added
> > > > > > as a new test.
> > > > > >
> > > > > > This testcase starts the loop at i = 1, but no peeling needs to be
> > > > > > done.  As a result the loop form looks like
> > > > > >
> > > > > > BB 1:
> > > > > >   i_0 = PHI <1(initial), i_1(latch)>;
> > > > > >
> > > > > >   if (<early-break-condition>)
> > > > > >     ...
> > > > > >   else
> > > > > >     goto BB 2;
> > > > > >
> > > > > > BB 2:
> > > > > >   i_1 = i_0 + 1;
> > > > > >   if (<main-exit>)
> > > > > >     goto BB 3;
> > > > > >   goto BB 1;
> > > > > >
> > > > > > BB 3:
> > > > > >   i_2 = PHI <i_1(2)>;
> > > > > >
> > > > > > Basically in the main IV exit we don't use the value *after*
> > > > > > incrementing but rather the value before incrementing.
> > > > > >
> > > > > > At the moment during peeling we wipe away this information and
> > > > > > replace in the exit the value of i_1.  This is incorrect and means
> > > > > > we can no longer tell which value of the IV we should actually be
> > > > > > calculating for that exit.
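
[To make the shape concrete, here is a hypothetical C analogue of the loop
above -- illustrative only, not the reduced testcase; the function name,
array and bound are invented:

  long
  find_limit (long *a, long n)
  {
    long i = 1;               /* IV starts at 1, like the Fortran 'i'.  */
    do
      {
        if (a[i - 1] < 0)     /* early-break exit */
          return -1;
        i = i + 1;
      }
    while (i <= n);           /* main IV exit tests the incremented value */
    return i - 1;             /* live use wants the pre-increment value */
  }

On the main-exit path i_1 = i_0 + 1, so the i - 1 use equals the header
value i_0; that is why the LC PHI in BB 3 carries i_0 (per the correction
at the top of the thread) rather than the latch value i_1.]
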
> > > > > >
> > > > > > This isn't triggered often on trunk because of an artificial
> > > > > > inflation of the VF, which causes the main exit not to be reached
> > > > > > very often, and such loops aren't common.  However, with the scalar
> > > > > > IV code this issue was brought to light.
> > > > > >
> > > > > > This patch preserves the value of the LC PHI set during edge
> > > > > > redirection.  We still end up creating the new PHI, but it uses the
> > > > > > correct value now.  The reason for still creating the duplicate PHI
> > > > > > is that there is code outside of peeling that today depends on the
> > > > > > PHI node ordering, so this was a conservative fix.  The duplicate
> > > > > > PHI gets elided later as it'll be unused.
> > > > > >
> > > > > > Bootstrapped Regtested on aarch64-none-linux-gnu,
> > > > > > arm-none-linux-gnueabihf, x86_64-pc-linux-gnu -m32, -m64
> > > > > > and no issues.
> > > > > >
> > > > > > Ok for master?
> > > > > >
> > > > > > Thanks,
> > > > > > Tamar
> > > > > >
> > > > > > gcc/ChangeLog:
> > > > > >
> > > > > >         * tree-vect-loop-manip.cc (slpeel_tree_duplicate_loop_to_edge_cfg):
> > > > > >         Fix up PHI in exit nodes.
> > > > > >
> > > > > > ---
> > > > > > diff --git a/gcc/tree-vect-loop-manip.cc b/gcc/tree-vect-loop-manip.cc
> > > > > > index 20141dbc2e54d49bca00a2e75a6733d181e05ed1..96ca273c24680556f16cdc9e465f490d7fcdb8a4 100644
> > > > > > --- a/gcc/tree-vect-loop-manip.cc
> > > > > > +++ b/gcc/tree-vect-loop-manip.cc
> > > > > > @@ -1755,6 +1755,22 @@ slpeel_tree_duplicate_loop_to_edge_cfg (class loop *loop, edge loop_exit,
> > > > > >             new_arg = PHI_ARG_DEF_FROM_EDGE (from_phi,
> > > > > >                                              loop_latch_edge (loop));
> > > > > >
> > > > > > +         /* Some loops start at a non-zero index, which then requires
> > > > > > +            redirection to return i instead of i + 1.  Edge
> > > > > > +            redirection already deals with this, but we need to make
> > > > > > +            sure we don't override the value it puts in for these
> > > > > > +            cases.  */
> > > > > > +         imm_use_iterator imm_iter;
> > > > > > +         use_operand_p use_p;
> > > > > > +         tree from_def = PHI_RESULT (from_phi);
> > > > > > +         bool abnormal_exit_condition = false;
> > > > > > +         FOR_EACH_IMM_USE_FAST (use_p, imm_iter, from_def)
> > > > > > +           if (gimple_bb (USE_STMT (use_p)) == main_loop_exit_block)
> > > > > > +             {
> > > > > > +               abnormal_exit_condition = true;
> > > > > > +               break;
> > > > > > +             }
> > > > > > +
> > > > >
> > > > > I'm a bit confused - isn't this exactly what 'peeled_iters'
> > > > > (aka LOOP_VINFO_EARLY_BREAKS_VECT_PEELED) is about?  That is,
> > > > > this loop is supposed to handle the main exit (and creating
> > > > > all required merge PHIs and eventually missing LC PHIs), so
> > > > > your patch suggests we put in the "wrong" result, but you
> > > > > leave the old 'peeled_iters' code around, even using the
> > > > > original 'caching' (but then for the wrong value?).
> > > >
> > > > No, peeled_iters means we want the value at the start of the vector
> > > > iteration.  This issue is that we want the value at the end of the
> > > > vector iteration - <offset>.
> > > >
> > > > So they aren't the same.  In a normal loop this adjustment is
> > > > calculated by vect_update_ivs_after_vectorizer through niters; because
> > > > that has the right PHI node, it doesn't use the incorrect final value
> > > > of i to adjust the count.
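
[A usage sketch for the hypothetical analogue above, walking through the
main-exit value under discussion; it links against the find_limit sketch
from the earlier aside:

  #include <assert.h>

  extern long find_limit (long *a, long n);

  int
  main (void)
  {
    long a[4] = { 1, 2, 3, 4 };  /* no negative element, so no early break */
    /* Scalar trace: i = 1, 2, 3, 4; the final increment makes i_1 == 5,
       i <= n fails, and the main exit is taken with i_0 == 4.  The live
       value is i - 1 == 4, i.e. the LC PHI of i_0.  Rewiring that PHI to
       the latch value i_1 would make the exit see 5 instead -- off by one
       on the main exit only; the early-break exits restart from the
       iteration start and are unaffected, which is why the bug hides when
       the early exit is almost always taken.  */
    assert (find_limit (a, 4) == 4);
    return 0;
  }
]
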
> > > > >
> > > > > At the same time the 2nd loop filling the alternate exit cases
> > > > > is always correct?
> > > >
> > > > Yes, because alternate exits always restart the loop from the start,
> > > > so the reduction for it takes you to before you do any adjustment at
> > > > all for i.
> > >
> > > So I believe the logic
> > >
> > >               /* When the vector loop is peeled then we need to use the
> > >                  value at start of the loop, otherwise the main loop exit
> > >                  should use the final iter value.  */
> > >               tree new_arg;
> > >               if (peeled_iters)
> > >                 new_arg = gimple_phi_result (from_phi);
> > >               else
> > >                 new_arg = PHI_ARG_DEF_FROM_EDGE (from_phi,
> > >                                                  loop_latch_edge (loop));
> > >
> > > for !peeled_iters is correct.  In this place we are looking for the
> > > initial value to use for the epilog loop - but in this case this
> > > value is different from the actual live value computed by the testcase.
> > > Note there is in general no way to associate the LC PHI [def] with
> > > the IV in the loop header if it is neither the PHI result nor the
> > > latch value - as it is in this case.  Instead the LC PHI is for
> > > an altogether different "IV" (but a related, offsetted one).
> > >
> > > And I see '5' being used as the continuation value via
> > > vect_update_ivs_after_vectorizer, so that seems correct.
> > >
> > > So, I can't really see what is wrong here?  The vectorized code
> > > also looks OK.  It should be possible to set up a C testcase for
> > > this, like
> >
> > As discussed on IRC, I'll move the fix somewhere else then.  But it
> > might end up being more disruptive.
>
> I'd like to have a look myself, but for this I have to be able to
> reproduce an actual miscompile.  So can you please give instructions
> on how to arrive at one?  What did you reduce the testcase from?
> What patches do I need to apply to reproduce, on which target, with
> what options?
>
The patches to apply:

https://gcc.gnu.org/pipermail/gcc-patches/2025-October/698584.html
https://gcc.gnu.org/pipermail/gcc-patches/2025-October/698586.html

This testcase:

https://gcc.gnu.org/pipermail/gcc-patches/2025-October/698585.html

With -march=armv8-a -O3.

Tamar

> Thanks,
> Richard.
>
> > Thanks,
> > Tamar
> >
> > >   int i = 1, j;
> > >   do
> > >     {
> > >       if (...) abort ();
> > >       j = i + 1;
> > >     }
> > >   while (...);
> > >   ... use i ...
> > >
> > > Richard.
> > >
> > > > >
> > > > > Does the issue show with the Fortran testcase on current trunk?
> > > >
> > > > Yes, the codegen on trunk today is wrong, but the main exit would
> > > > never be reached because of the increased VF.  So the test "passes"
> > > > because it takes the early exit.
> > > >
> > > > Thanks,
> > > > Tamar
> > > >
> > > > > Thanks,
> > > > > Richard.
> > > > >
> > > > > >             /* Check if we've already created a new phi node during edge
> > > > > >                redirection and re-use it if so.  Otherwise create a
> > > > > >                LC PHI node to feed the merge PHI.  */
> > > > > > @@ -1769,6 +1785,11 @@ slpeel_tree_duplicate_loop_to_edge_cfg (class loop *loop, edge loop_exit,
> > > > > >             new_arg = *res;
> > > > > >           else
> > > > > >             {
> > > > > > +             /* If we already found an LC PHI for the current PHI then
> > > > > > +                just re-use it.  */
> > > > > > +             if (abnormal_exit_condition)
> > > > > > +               new_arg = from_def;
> > > > > > +
> > > > > >               /* Create the LC PHI node for the exit.  */
> > > > > >               tree new_def = copy_ssa_name (new_arg);
> > > > > >               gphi *lc_phi
> > > > > > @@ -1779,7 +1800,7 @@ slpeel_tree_duplicate_loop_to_edge_cfg (class loop *loop, edge loop_exit,
> > > > > >
> > > > > >               /* Create the PHI node in the merge block merging the
> > > > > >                  main and early exit values.  */
> > > > > > -             tree new_res = copy_ssa_name (gimple_phi_result (from_phi));
> > > > > > +             tree new_res = copy_ssa_name (from_def);
> > > > > >               gphi *lcssa_phi = create_phi_node (new_res, new_preheader);
> > > > > >               edge main_e = single_succ_edge (main_loop_exit_block);
> > > > > >               SET_PHI_ARG_DEF_ON_EDGE (lcssa_phi, main_e, new_arg);
> > > > > >
> > > > > > --
> > > > >
> > > > > --
> > > > > Richard Biener <[email protected]>
> > > > > SUSE Software Solutions Germany GmbH,
> > > > > Frankenstrasse 146, 90461 Nuernberg, Germany;
> > > > > GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG Nuernberg)
> > >
> > > --
> > > Richard Biener <[email protected]>
> > > SUSE Software Solutions Germany GmbH,
> > > Frankenstrasse 146, 90461 Nuernberg, Germany;
> > > GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG Nuernberg)
>
> --
> Richard Biener <[email protected]>
> SUSE Software Solutions Germany GmbH,
> Frankenstrasse 146, 90461 Nuernberg, Germany;
> GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG Nuernberg)
