On Mon, Dec 09, 2024 at 02:07:07PM +0100, Richard Biener wrote: > On Tue, Dec 3, 2024 at 2:42 PM Richard Biener > <richard.guent...@gmail.com> wrote: > > > > On Tue, Dec 3, 2024 at 2:07 PM Lewis Hyatt <lhy...@gmail.com> wrote: > > > > > > On Tue, Dec 03, 2024 at 01:28:28PM +0100, Richard Biener wrote: > > > > On Mon, Dec 2, 2024 at 1:55 AM Lewis Hyatt <lhy...@gmail.com> wrote: > > > > > > > > > > This patch is my new way to handle what was previously done in v2 > > > > > patch > > > > > 04/14, discussed at: > > > > > https://gcc.gnu.org/pipermail/gcc-patches/2024-November/669527.html > > > > > > > > > > The only places I have found running into trouble with reallocated > > > > > PHI nodes > > > > > (at least, the only places revealed by changing the size of > > > > > location_t) are > > > > > those which remember a gphi object before a call to loop_version() > > > > > and then > > > > > try to use it after. I fixed three affected call sites in this new > > > > > patch. > > > > > > > > I guess it might be nice to avoid PHI node re-allocation by > > > > loop_version(), > > > > I remember fixing up other CFG manipulations to avoid the need for > > > > intermittent more PHI arguments. Anway, the patch is OK besides one > > > > minor adjustment, see below ... > > > > > > > > > > Thanks again for looking at all the patches. Comment below... > > > > > > ... > > > > > > > > diff --git a/gcc/tree-vect-loop-manip.cc b/gcc/tree-vect-loop-manip.cc > > > > > index 54b29540eb5..4f3abb94e6a 100644 > > > > > --- a/gcc/tree-vect-loop-manip.cc > > > > > +++ b/gcc/tree-vect-loop-manip.cc > > > > > @@ -4190,6 +4190,14 @@ vect_loop_versioning (loop_vec_info loop_vinfo, > > > > > prob * prob2, (prob * prob2).invert (), > > > > > prob * prob2, (prob * prob2).invert (), > > > > > true); > > > > > + > > > > > + /* If the PHI nodes in the loop header were reallocated, we > > > > > need to fix up > > > > > + our internally stashed copies of those. */ > > > > > + if (loop_to_version == loop) > > > > > + for (auto gsi = gsi_start_phis (loop->header); > > > > > + !gsi_end_p (gsi); gsi_next (&gsi)) > > > > > + loop_vinfo->resync_stmt_addr (gsi.phi ()); > > > > > > > > Instead of a new resync_stmt_addr method can you simply do > > > > > > > > if (stmt_vec_info s = loop_vinfo->lookup_stmt (gsi.phi ()) > > > > STMT_VINFO_STMT (s) = gsi.phi (); > > > > > > > > please? > > > > > > > > OK with that change. > > > > > > > > Thanks, > > > > Richard. > > > > > > The reason I added the new function is that lookup_stmt() does: > > > > > > if (res && res->stmt == stmt) > > > return res; > > > > > > so it returns NULL if the stored stmt does not match the one that was > > > looked > > > up. > > > > Doh, I wonder what we added that for! > > > > > I could add a "bool relaxed = false" argument to lookup_stmt() instead, > > > or could add an alternate function lookup_stmt_relaxed() if you prefer? I > > > guess the second option would be pretty similar to what I did with > > > resync_stmt_addr(), just it would not actually make the change within the > > > function itself. > > > > No, in this case your patch is OK with an added comment (in your new > > function) > > why you can't use lookup_stmt. > > Btw, I tried to look into why loop_version would ever need more PHI capacity > temporarily - so I did the following but it never triggers during bootstrap or > regtest. That said - do you have a testcase that exercises this path? > > diff --git a/gcc/tree-vectorizer.cc b/gcc/tree-vectorizer.cc > index 92ecd881508..bc3c6254fdf 100644 > --- a/gcc/tree-vectorizer.cc > +++ b/gcc/tree-vectorizer.cc > @@ -556,6 +556,7 @@ vec_info::resync_stmt_addr (gimple *stmt) > stmt_vec_info res = stmt_vec_infos[uid - 1]; > if (res && res->stmt) > { > + gcc_assert (res->stmt == stmt); > res->stmt = stmt; > return res; > } > >
Sorry, I should have mentioned the relevant testcases in the commit message. Reverting this whole patch would lead to the following C testcases breaking on sparc 32-bit (I tested on cfarm216): gcc.dg/graphite/pr81945.c gcc.dg/torture/pr113707-2.c gcc.dg/vect/bb-slp-41.c gcc.dg/vect/pr53185.c gcc.dg/vect/pr58508.c gcc.dg/vect/pr81740-1.c gcc.dg/vect/vect-100.c gcc.dg/vect/vect-106.c gcc.dg/vect/vect-alias-check-1.c gcc.dg/vect/vect-alias-check-6.c gcc.dg/vect/vect-align-2.c If, instead of reverting the whole patch, you just add the gcc_assert that you mentioned in tree-vectorizer.cc, then you will see it go off on those vect/ cases but also on many more. It seems that on 64-bit platforms there is usually enough room for 4 args in a newly created PHI node, which is why this reallocation doesn't usually happen, at least for typical examples. On sparc, the way it works out with the size and padding etc, you can end up with room for just the minimum 2 PHI arguments, which then necessitates the reallocation inside loop_version(). resize_phi_nodes() is being called from reserve_phi_args_for_new_edge(), from gimple_execute_on_growing_pred(), from duplicate_loop_body_to_header_edge(). One way to observe this on 64-bit platforms is to be less aggressive about preallocating extra room for the PHI args. Right now it rounds up to the next power of 2, but it could round up instead to the next ggc bucket size. Here is a quick patch that does that: ===== diff --git a/gcc/tree-phinodes.cc b/gcc/tree-phinodes.cc index 5a7e4a94e57..e80611748aa 100644 --- a/gcc/tree-phinodes.cc +++ b/gcc/tree-phinodes.cc @@ -143,7 +143,7 @@ static int ideal_phi_node_len (int len) { size_t size, new_size; - int log2, new_len; + int new_len; /* We do not support allocations of less than two PHI argument slots. */ if (len < 2) @@ -153,9 +153,8 @@ ideal_phi_node_len (int len) size = sizeof (struct gphi) + (len - 1) * sizeof (struct phi_arg_d); - /* Round it up to the next power of two. */ - log2 = ceil_log2 (size); - new_size = 1 << log2; + /* Round it up to what ggc is actually going to allocate. */ + new_size = ggc_round_alloc_size (size); /* Now compute and return the number of PHI argument slots given an ideal size allocation. */ ===== I have not looked into that change much, but I could if you find it interesting (it definitely results in smaller allocations). I did verify it bootstraps + regtests cleanly for C++ on x86-64. With that change in place, then your gcc_assert in tree-vectorizer.cc would be triggering in many cases: FAIL: gcc.dg/vect/bb-slp-41.c (internal compiler error: in resync_stmt_addr, at tree-vectorizer.cc:559) FAIL: gcc.dg/vect/bb-slp-42.c (internal compiler error: in resync_stmt_addr, at tree-vectorizer.cc:559) FAIL: gcc.dg/vect/bb-slp-phis-1.c (internal compiler error: in resync_stmt_addr, at tree-vectorizer.cc:559) FAIL: gcc.dg/vect/no-trapping-math-1.c (internal compiler error: in resync_stmt_addr, at tree-vectorizer.cc:559) FAIL: gcc.dg/vect/pr109011-1.c (internal compiler error: in resync_stmt_addr, at tree-vectorizer.cc:559) FAIL: gcc.dg/vect/pr109011-2.c (internal compiler error: in resync_stmt_addr, at tree-vectorizer.cc:559) FAIL: gcc.dg/vect/pr109011-3.c (internal compiler error: in resync_stmt_addr, at tree-vectorizer.cc:559) FAIL: gcc.dg/vect/pr109011-4.c (internal compiler error: in resync_stmt_addr, at tree-vectorizer.cc:559) FAIL: gcc.dg/vect/pr109011-5.c (internal compiler error: in resync_stmt_addr, at tree-vectorizer.cc:559) FAIL: gcc.dg/vect/pr116352.c (internal compiler error: in resync_stmt_addr, at tree-vectorizer.cc:559) FAIL: gcc.dg/vect/pr117874.c (internal compiler error: in resync_stmt_addr, at tree-vectorizer.cc:559) FAIL: gcc.dg/vect/pr20122.c (internal compiler error: in resync_stmt_addr, at tree-vectorizer.cc:559) FAIL: gcc.dg/vect/pr23816-1.c (internal compiler error: in resync_stmt_addr, at tree-vectorizer.cc:559) FAIL: gcc.dg/vect/pr23816-2.c (internal compiler error: in resync_stmt_addr, at tree-vectorizer.cc:559) FAIL: gcc.dg/vect/pr24059.c (internal compiler error: in resync_stmt_addr, at tree-vectorizer.cc:559) FAIL: gcc.dg/vect/pr28952.c (internal compiler error: in resync_stmt_addr, at tree-vectorizer.cc:559) FAIL: gcc.dg/vect/pr37482.c (internal compiler error: in resync_stmt_addr, at tree-vectorizer.cc:559) FAIL: gcc.dg/vect/pr37539.c -flto -ffat-lto-objects (internal compiler error: in resync_stmt_addr, at tree-vectorizer.cc:559) FAIL: gcc.dg/vect/pr41956.c (internal compiler error: in resync_stmt_addr, at tree-vectorizer.cc:559) FAIL: gcc.dg/vect/pr49771.c (internal compiler error: in resync_stmt_addr, at tree-vectorizer.cc:559) FAIL: gcc.dg/vect/pr53185.c (internal compiler error: in resync_stmt_addr, at tree-vectorizer.cc:559) FAIL: gcc.dg/vect/pr56826.c -flto -ffat-lto-objects (internal compiler error: in resync_stmt_addr, at tree-vectorizer.cc:559) FAIL: gcc.dg/vect/pr56933.c (internal compiler error: in resync_stmt_addr, at tree-vectorizer.cc:559) FAIL: gcc.dg/vect/pr58508.c (internal compiler error: in resync_stmt_addr, at tree-vectorizer.cc:559) FAIL: gcc.dg/vect/pr62075.c (internal compiler error: in resync_stmt_addr, at tree-vectorizer.cc:559) FAIL: gcc.dg/vect/pr79887.c (internal compiler error: in resync_stmt_addr, at tree-vectorizer.cc:559) FAIL: gcc.dg/vect/pr80815-1.c (internal compiler error: in resync_stmt_addr, at tree-vectorizer.cc:559) FAIL: gcc.dg/vect/pr80815-2.c (internal compiler error: in resync_stmt_addr, at tree-vectorizer.cc:559) FAIL: gcc.dg/vect/pr82108.c -flto -ffat-lto-objects (internal compiler error: in resync_stmt_addr, at tree-vectorizer.cc:559) FAIL: gcc.dg/vect/pr84357.c (internal compiler error: in resync_stmt_addr, at tree-vectorizer.cc:559) FAIL: gcc.dg/vect/pr85586.c (internal compiler error: in resync_stmt_addr, at tree-vectorizer.cc:559) FAIL: gcc.dg/vect/pr85597.c (internal compiler error: in resync_stmt_addr, at tree-vectorizer.cc:559) FAIL: gcc.dg/vect/pr89268.c (internal compiler error: in resync_stmt_addr, at tree-vectorizer.cc:559) FAIL: gcc.dg/vect/pr92723.c (internal compiler error: in resync_stmt_addr, at tree-vectorizer.cc:559) FAIL: gcc.dg/vect/slp-45.c (internal compiler error: in resync_stmt_addr, at tree-vectorizer.cc:559) FAIL: gcc.dg/vect/tsvc/vect-tsvc-s1421.c (internal compiler error: in resync_stmt_addr, at tree-vectorizer.cc:559) FAIL: gcc.dg/vect/tsvc/vect-tsvc-s151.c (internal compiler error: in resync_stmt_addr, at tree-vectorizer.cc:559) FAIL: gcc.dg/vect/tsvc/vect-tsvc-s422.c (internal compiler error: in resync_stmt_addr, at tree-vectorizer.cc:559) FAIL: gcc.dg/vect/tsvc/vect-tsvc-s423.c (internal compiler error: in resync_stmt_addr, at tree-vectorizer.cc:559) FAIL: gcc.dg/vect/tsvc/vect-tsvc-s424.c (internal compiler error: in resync_stmt_addr, at tree-vectorizer.cc:559) FAIL: gcc.dg/vect/vect-77-global.c (internal compiler error: in resync_stmt_addr, at tree-vectorizer.cc:559) FAIL: gcc.dg/vect/vect-78-global.c (internal compiler error: in resync_stmt_addr, at tree-vectorizer.cc:559) FAIL: gcc.dg/vect/vect-alias-check-1.c (internal compiler error: in resync_stmt_addr, at tree-vectorizer.cc:559) FAIL: gcc.dg/vect/vect-alias-check-10.c (internal compiler error: in resync_stmt_addr, at tree-vectorizer.cc:559) FAIL: gcc.dg/vect/vect-alias-check-11.c (internal compiler error: in resync_stmt_addr, at tree-vectorizer.cc:559) FAIL: gcc.dg/vect/vect-alias-check-12.c (internal compiler error: in resync_stmt_addr, at tree-vectorizer.cc:559) FAIL: gcc.dg/vect/vect-alias-check-14.c (internal compiler error: in resync_stmt_addr, at tree-vectorizer.cc:559) FAIL: gcc.dg/vect/vect-alias-check-15.c (internal compiler error: in resync_stmt_addr, at tree-vectorizer.cc:559) FAIL: gcc.dg/vect/vect-alias-check-16.c (internal compiler error: in resync_stmt_addr, at tree-vectorizer.cc:559) FAIL: gcc.dg/vect/vect-alias-check-18.c (internal compiler error: in resync_stmt_addr, at tree-vectorizer.cc:559) FAIL: gcc.dg/vect/vect-alias-check-19.c (internal compiler error: in resync_stmt_addr, at tree-vectorizer.cc:559) FAIL: gcc.dg/vect/vect-alias-check-20.c (internal compiler error: in resync_stmt_addr, at tree-vectorizer.cc:559) FAIL: gcc.dg/vect/vect-alias-check-6.c (internal compiler error: in resync_stmt_addr, at tree-vectorizer.cc:559) FAIL: gcc.dg/vect/vect-alias-check-7.c (internal compiler error: in resync_stmt_addr, at tree-vectorizer.cc:559) FAIL: gcc.dg/vect/vect-alias-check-8.c (internal compiler error: in resync_stmt_addr, at tree-vectorizer.cc:559) FAIL: gcc.dg/vect/vect-alias-check-9.c (internal compiler error: in resync_stmt_addr, at tree-vectorizer.cc:559) FAIL: gcc.dg/vect/vect-alias-check.c (internal compiler error: in resync_stmt_addr, at tree-vectorizer.cc:559) FAIL: gcc.dg/vect/vect-cond-11.c (internal compiler error: in resync_stmt_addr, at tree-vectorizer.cc:559) FAIL: gcc.dg/vect/vect-cost-model-1.c (internal compiler error: in resync_stmt_addr, at tree-vectorizer.cc:559) FAIL: gcc.dg/vect/vect-early-break_124-pr114403.c (internal compiler error: in resync_stmt_addr, at tree-vectorizer.cc:559) FAIL: gcc.dg/vect/vect-early-break_23.c (internal compiler error: in resync_stmt_addr, at tree-vectorizer.cc:559) FAIL: gcc.dg/vect/vect-early-break_24.c (internal compiler error: in resync_stmt_addr, at tree-vectorizer.cc:559) FAIL: gcc.dg/vect/vect-early-break_80.c (internal compiler error: in resync_stmt_addr, at tree-vectorizer.cc:559) FAIL: gcc.dg/vect/vect-live-3.c -flto -ffat-lto-objects (internal compiler error: in resync_stmt_addr, at tree-vectorizer.cc:559) FAIL: gcc.dg/vect/vect-live-4.c -flto -ffat-lto-objects (internal compiler error: in resync_stmt_addr, at tree-vectorizer.cc:559) FAIL: gcc.dg/vect/vect-live-5.c -flto -ffat-lto-objects (internal compiler error: in resync_stmt_addr, at tree-vectorizer.cc:559) FAIL: gcc.dg/vect/vect-live-slp-2.c -flto -ffat-lto-objects (internal compiler error: in resync_stmt_addr, at tree-vectorizer.cc:559) FAIL: gcc.dg/vect/vect-over-widen-1-big-array.c (internal compiler error: in resync_stmt_addr, at tree-vectorizer.cc:559) FAIL: gcc.dg/vect/vect-over-widen-1.c (internal compiler error: in resync_stmt_addr, at tree-vectorizer.cc:559) FAIL: gcc.dg/vect/vect-over-widen-2-big-array.c (internal compiler error: in resync_stmt_addr, at tree-vectorizer.cc:559) FAIL: gcc.dg/vect/vect-over-widen-2.c (internal compiler error: in resync_stmt_addr, at tree-vectorizer.cc:559) FAIL: gcc.dg/vect/vect-over-widen-3-big-array.c (internal compiler error: in resync_stmt_addr, at tree-vectorizer.cc:559) FAIL: gcc.dg/vect/vect-over-widen-3.c (internal compiler error: in resync_stmt_addr, at tree-vectorizer.cc:559) FAIL: gcc.dg/vect/vect-over-widen-4-big-array.c (internal compiler error: in resync_stmt_addr, at tree-vectorizer.cc:559) FAIL: gcc.dg/vect/vect-over-widen-4.c (internal compiler error: in resync_stmt_addr, at tree-vectorizer.cc:559) FAIL: gcc.dg/vect/vect-simd-1.c (internal compiler error: in resync_stmt_addr, at tree-vectorizer.cc:559) FAIL: gcc.dg/vect/vect-simd-2.c (internal compiler error: in resync_stmt_addr, at tree-vectorizer.cc:559) FAIL: gcc.dg/vect/vect-strided-epilogue-1.c (internal compiler error: in resync_stmt_addr, at tree-vectorizer.cc:559) FAIL: gcc.dg/vect/vect-vfa-01.c (internal compiler error: in resync_stmt_addr, at tree-vectorizer.cc:559) FAIL: gcc.dg/vect/vect-vfa-02.c (internal compiler error: in resync_stmt_addr, at tree-vectorizer.cc:559) FAIL: gcc.dg/vect/vect-vfa-03.c (internal compiler error: in resync_stmt_addr, at tree-vectorizer.cc:559) FAIL: gcc.dg/vect/vect-vfa-04.c (internal compiler error: in resync_stmt_addr, at tree-vectorizer.cc:559) FAIL: gcc.dg/vect/vect-vfa-slp.c (internal compiler error: in resync_stmt_addr, at tree-vectorizer.cc:559) FAIL: gcc.dg/vect/vect-widen-shift-s8.c (internal compiler error: in resync_stmt_addr, at tree-vectorizer.cc:559) FAIL: gcc.dg/vect/vect-widen-shift-u8.c (internal compiler error: in resync_stmt_addr, at tree-vectorizer.cc:559) since then the need for PHI reallocations is much more common. Hope that's helpful... -Lewis