https://gcc.gnu.org/bugzilla/show_bug.cgi?id=79088
--- Comment #4 from Richard Biener <rguenth at gcc dot gnu.org> ---
But note the threading in the testcase shouldn't by itself affect the
assumptions of loop 3, looking at the path and the shape of the CFG at the
point we do this analysis.

So I think the logic we use is flawed because it doesn't reflect what the
real issue is.  The real issue is that we thread a loop entry of an outer
loop to a loop header of one of its children, which will ultimately cause
the former inner loop entry from the outer loop to become another latch of
that inner loop.

So it's the threading final destination being a loop header and crossing
a loop entry edge of an outer loop.

The following captures that (and fixes the testcase).  It also moves the
cache wiping after the other code that might still end up wiping the
threading path instead.

Index: gcc/tree-ssa-threadupdate.c
===================================================================
--- gcc/tree-ssa-threadupdate.c (revision 244523)
+++ gcc/tree-ssa-threadupdate.c (working copy)
@@ -2082,42 +2082,6 @@ mark_threaded_blocks (bitmap threaded_bl
   else
     bitmap_copy (threaded_blocks, tmp);
 
-  /* Look for jump threading paths which cross multiple loop headers.
-
-     The code to thread through loop headers will change the CFG in ways
-     that invalidate the cached loop iteration information.  So we must
-     detect that case and wipe the cached information.  */
-  EXECUTE_IF_SET_IN_BITMAP (tmp, 0, i, bi)
-    {
-      basic_block bb = BASIC_BLOCK_FOR_FN (cfun, i);
-      FOR_EACH_EDGE (e, ei, bb->preds)
-        {
-          if (e->aux)
-            {
-              vec<jump_thread_edge *> *path = THREAD_PATH (e);
-
-              for (unsigned int i = 0, crossed_headers = 0;
-                   i < path->length ();
-                   i++)
-                {
-                  basic_block dest = (*path)[i]->e->dest;
-                  basic_block src = (*path)[i]->e->src;
-                  crossed_headers += (dest == dest->loop_father->header);
-                  /* If we step from a block outside an irreducible region
-                     to a block inside an irreducible region, then we have
-                     crossed into a loop.  */
-                  crossed_headers += ((src->flags & BB_IRREDUCIBLE_LOOP)
-                                      != (dest->flags & BB_IRREDUCIBLE_LOOP));
-                  if (crossed_headers > 1)
-                    {
-                      vect_free_loop_info_assumptions (dest->loop_father);
-                      break;
-                    }
-                }
-            }
-        }
-    }
-
   /* If we have a joiner block (J) which has two successors S1 and S2 and
      we are threading though S1 and the final destination of the thread
      is S2, then we must verify that any PHI nodes in S2 have the same
@@ -2161,6 +2125,40 @@ mark_threaded_blocks (bitmap threaded_bl
             }
         }
     }
+
+  /* Look for jump threading paths which thread to a loop header crossing
+     loop entry edges.
+
+     The code to thread through loop headers will change the CFG to add
+     new latches to the destination loop and thus invalidate the cached
+     loop iteration information.  So we must detect that case and wipe the
+     cached information.  */
+  EXECUTE_IF_SET_IN_BITMAP (tmp, 0, i, bi)
+    {
+      basic_block bb = BASIC_BLOCK_FOR_FN (cfun, i);
+      FOR_EACH_EDGE (e, ei, bb->preds)
+        {
+          if (e->aux)
+            {
+              vec<jump_thread_edge *> *path = THREAD_PATH (e);
+
+              basic_block path_end = (*path)[path->length () - 1]->e->dest;
+              if (path_end == path_end->loop_father->header)
+                for (unsigned int i = 0; i < path->length () - 1; i++)
+                  {
+                    basic_block dest = (*path)[i]->e->dest;
+                    basic_block src = (*path)[i]->e->src;
+                    if (dest == dest->loop_father->header
+                        && flow_loop_nested_p (src->loop_father,
+                                               dest->loop_father))
+                      {
+                        vect_free_loop_info_assumptions (path_end->loop_father);
+                        break;
+                      }
+                  }
+            }
+        }
+    }
 
   BITMAP_FREE (tmp);
 }
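
To make the condition in the new hunk easier to follow outside of GCC, here is a
minimal standalone sketch of the same check on a toy CFG.  It is not GCC code:
Loop, Block, Edge, nested_p, is_header, path_adds_latch and the demo function
are all hypothetical names invented for illustration, and nested_p only
approximates the intent of flow_loop_nested_p (strict nesting).  The toy loop
tree has a root pseudo-loop, an outer loop with header block 1 and an inner
loop with header block 3.

```cpp
#include <cstddef>
#include <vector>

// Toy stand-ins for GCC's loop and basic_block structures (hypothetical).
struct Loop {
  const Loop *outer;  // enclosing loop, nullptr for the root pseudo-loop
  int header;         // id of the header block, -1 for the root pseudo-loop
};

struct Block {
  int id;
  const Loop *loop_father;  // innermost loop containing this block
};

struct Edge {
  const Block *src;
  const Block *dest;
};

// True if INNER is strictly nested inside OUTER.
static bool nested_p(const Loop *outer, const Loop *inner) {
  for (const Loop *l = inner->outer; l != nullptr; l = l->outer)
    if (l == outer)
      return true;
  return false;
}

static bool is_header(const Block *b) {
  return b->id == b->loop_father->header;
}

// True when the path ends at a loop header and one of its edges before the
// last one enters a nested loop through that loop's header, i.e. the path
// crosses a loop entry edge on its way to the final header.
static bool path_adds_latch(const std::vector<Edge> &path) {
  if (path.empty())
    return false;
  const Block *path_end = path.back().dest;
  if (!is_header(path_end))
    return false;
  // Like the patch, skip the final edge: only earlier edges are inspected.
  for (std::size_t i = 0; i + 1 < path.size(); ++i) {
    const Block *src = path[i].src;
    const Block *dest = path[i].dest;
    if (is_header(dest) && nested_p(src->loop_father, dest->loop_father))
      return true;
  }
  return false;
}

// Usage: thread from outside the outer loop (block 0) through the outer
// header (block 1) to the inner header (block 3).
static bool demo_outer_entry_to_inner_header() {
  Loop root{nullptr, -1};
  Loop outer{&root, 1};
  Loop inner{&outer, 3};
  Block b0{0, &root}, b1{1, &outer}, b2{2, &outer}, b3{3, &inner};
  std::vector<Edge> path{{&b0, &b1}, {&b1, &b2}, {&b2, &b3}};
  return path_adds_latch(path);
}

// Counter-example: the path stays inside the outer loop and only reaches
// the inner header with its final edge, which the check does not inspect.
static bool demo_path_within_outer_loop() {
  Loop root{nullptr, -1};
  Loop outer{&root, 1};
  Loop inner{&outer, 3};
  Block b1{1, &outer}, b2{2, &outer}, b3{3, &inner};
  std::vector<Edge> path{{&b1, &b2}, {&b2, &b3}};
  return path_adds_latch(path);
}
```

In this sketch the first demo path is flagged, because its first edge enters the
outer loop through its header and the path ends at the inner header.  The second
is not flagged, because no edge before the last one crosses a loop entry.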