https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120003
--- Comment #6 from Richard Biener <rguenth at gcc dot gnu.org> ---
(In reply to Andrew Macleod from comment #4)
> This seems to be the issue?
>
>   <bb 4> [local count: 350791453]:
>   _1 = g (i_3);
>   if (_1 != 0)
>     goto <bb 5>; [50.00%]
>   else
>     goto <bb 6>; [50.00%]
>
>   <bb 5> [local count: 175395727]:
>
>   <bb 6> [local count: 1063004408]:
>   # iftmp.0_4 = PHI <1(3), 0(4), 1(5)>
>
> That 3 way PHI isn't used in any threads, so we don't get a threaded path
> to the eventual return of 1.

The irreducible check is at least badly named - as written it does not
make the containing loop irreducible, instead it partly unrolls things.

But with that fixed we still reject the path in
jt_path_registry::cancel_invalid_paths by

2840          cancel_thread (&path, "Path crosses loop header but does not exit it");

which is true again.  We can allow another subset of threads, but this
then enables the 9->6->7->3->6 path, which just duplicates one iteration
and so does not help.  We need to create a subloop or sibling loop w/o
the call.  I don't see offhand why this doesn't work - but then isolating
a path will never create a new loop(?)

I've played with the following.

diff --git a/gcc/tree-ssa-threadbackward.cc b/gcc/tree-ssa-threadbackward.cc
index 23bfc14c8f0..2603d27f1f3 100644
--- a/gcc/tree-ssa-threadbackward.cc
+++ b/gcc/tree-ssa-threadbackward.cc
@@ -789,6 +789,7 @@ back_threader_profitability::profitable_path_p (const vec<basic_block> &m_path,
   *creates_irreducible_loop = false;
   if (m_threaded_through_latch
       && loop == taken_edge->dest->loop_father
+      && taken_edge->dest != m_path[m_path.length () - 2]
       && (determine_bb_domination_status (loop, taken_edge->dest)
           == DOMST_NONDOMINATING))
     *creates_irreducible_loop = true;
diff --git a/gcc/tree-ssa-threadupdate.cc b/gcc/tree-ssa-threadupdate.cc
index 4e5c7566857..d91c0c7bf20 100644
--- a/gcc/tree-ssa-threadupdate.cc
+++ b/gcc/tree-ssa-threadupdate.cc
@@ -2811,6 +2811,10 @@ jt_path_registry::cancel_invalid_paths (vec<jump_thread_edge *> &path)
       && flow_loop_nested_p (exit->dest->loop_father, exit->src->loop_father))
     return false;
 
+  // If we thread a whole loop round-trip, we are just creating a subloop
+  if (entry->dest == exit->dest)
+    return false;
+
   if (cfun->curr_properties & PROP_loop_opts_done)
     return false;
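
For illustration only, here is a toy model of the round-trip test from the
second hunk.  It is not GCC code: ToyEdge, edges_of and is_round_trip are
hypothetical names, and the assumption that the entry and exit edges are
the first and last edges of the path is mine.  Under that assumption, the
9->6->7->3->6 path mentioned above enters at block 6 and also ends at
block 6, so a check shaped like entry->dest == exit->dest would appear to
accept it.

```cpp
#include <cassert>   // assert, for the usage lines in the note below
#include <cstddef>   // std::size_t
#include <vector>

// Toy stand-in for a CFG edge; NOT GCC's edge or jump_thread_edge types.
struct ToyEdge {
  int src;   // index of the source basic block
  int dest;  // index of the destination basic block
};

// Turn a path written as a block sequence (e.g. 9->6->7->3->6) into edges.
std::vector<ToyEdge> edges_of(const std::vector<int> &blocks) {
  std::vector<ToyEdge> edges;
  for (std::size_t i = 0; i + 1 < blocks.size(); ++i)
    edges.push_back({blocks[i], blocks[i + 1]});
  return edges;
}

// Same shape as the added test `entry->dest == exit->dest`, assuming the
// entry edge is the first edge of the path and the exit edge is the last.
// A single-edge path is not treated as a round trip in this toy model.
bool is_round_trip(const std::vector<ToyEdge> &path) {
  if (path.size() < 2)
    return false;
  return path.front().dest == path.back().dest;
}
```

With these helpers, is_round_trip (edges_of ({9, 6, 7, 3, 6})) is true,
while a straight-line path such as 2->3->4 gives false.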