SLP induction vectorization runs into the issue that it remembers pointers to PHI nodes in the SLP tree during analysis. But those may get invalidated by loop copying (for prologue/epilogue peeling or versioning) as the low-level CFG helper copy_bbs works in the way of copying individual BBs plus their outgoing edges but with old destinations and at the end re-directing the edges to the desired location. In SSA this triggers the whole machinery of making room for new PHI nodes -- that is undesirable because it causes re-allocation of PHI nodes in the set of source blocks.
After much pondering I arrived at the following (least ugly) solution to this "problem" (well, I define it as a problem, it's at least an inefficiency and a workaround in the vectorizer would be way uglier). Namely simply do not trigger the SSA machinery for blocks with BB_DUPLICATED (I skimmed all other users and they seem fine). In the process I also implemented some poisoning of the old PHI node when we reallocate (well, free) PHI nodes. But that triggers some other issues, one fixed by the tree-ssa-phionlycoprop.c hunk below. So I'm not submitting it as part of this fix. Bootstrapped (with the poisoning sofar, plain patch still running) on x86_64-unknown-linux-gnu, testing in progress. Comments welcome, testing won't finish before I leave for the weekend. Thanks, Richard. 2017-06-23 Richard Biener <rguent...@suse.de> * cfghooks.c (duplicate_block): Do not copy BB_DUPLICATED flag. (copy_bbs): Set BB_DUPLICATED flag early. (execute_on_growing_pred): Do not execute for BB_DUPLICATED marked blocks. (execute_on_shrinking_pred): Likewise. * tree-ssa.c (ssa_redirect_edge): Do not look for PHI args in BB_DUPLICATED blocks. * tree-ssa-phionlycoprop.c (eliminate_degenerate_phis_1): Properly iterate over all PHIs considering removal of *gsi. Index: gcc/cfghooks.c =================================================================== --- gcc/cfghooks.c (revision 249552) +++ gcc/cfghooks.c (working copy) @@ -1087,7 +1087,7 @@ duplicate_block (basic_block bb, edge e, if (after) move_block_after (new_bb, after); - new_bb->flags = bb->flags; + new_bb->flags = (bb->flags & ~BB_DUPLICATED); FOR_EACH_EDGE (s, ei, bb->succs) { /* Since we are creating edges from a new block to successors @@ -1207,7 +1207,8 @@ flow_call_edges_add (sbitmap blocks) void execute_on_growing_pred (edge e) { - if (cfg_hooks->execute_on_growing_pred) + if (! (e->dest->flags & BB_DUPLICATED) + && cfg_hooks->execute_on_growing_pred) cfg_hooks->execute_on_growing_pred (e); } @@ -1217,7 +1218,8 @@ execute_on_growing_pred (edge e) void execute_on_shrinking_pred (edge e) { - if (cfg_hooks->execute_on_shrinking_pred) + if (! (e->dest->flags & BB_DUPLICATED) + && cfg_hooks->execute_on_shrinking_pred) cfg_hooks->execute_on_shrinking_pred (e); } @@ -1353,6 +1355,12 @@ copy_bbs (basic_block *bbs, unsigned n, basic_block bb, new_bb, dom_bb; edge e; + /* Mark the blocks to be copied. This is used by edge creation hooks + to decide whether to reallocate PHI nodes capacity to avoid reallocating + PHIs in the set of source BBs. */ + for (i = 0; i < n; i++) + bbs[i]->flags |= BB_DUPLICATED; + /* Duplicate bbs, update dominators, assign bbs to loops. */ for (i = 0; i < n; i++) { @@ -1360,7 +1368,6 @@ copy_bbs (basic_block *bbs, unsigned n, bb = bbs[i]; new_bb = new_bbs[i] = duplicate_block (bb, NULL, after); after = new_bb; - bb->flags |= BB_DUPLICATED; if (bb->loop_father) { /* Possibly set loop header. */ Index: gcc/tree-ssa-phionlycprop.c =================================================================== --- gcc/tree-ssa-phionlycprop.c (revision 249552) +++ gcc/tree-ssa-phionlycprop.c (working copy) @@ -420,10 +420,11 @@ eliminate_degenerate_phis_1 (basic_block basic_block son; bool cfg_altered = false; - for (gsi = gsi_start_phis (bb); !gsi_end_p (gsi); gsi_next (&gsi)) + for (gsi = gsi_start_phis (bb); !gsi_end_p (gsi);) { gphi *phi = gsi.phi (); - + /* We might end up removing PHI so advance the iterator now. */ + gsi_next (&gsi); cfg_altered |= eliminate_const_or_copy (phi, interesting_names, need_eh_cleanup); } Index: gcc/tree-ssa.c =================================================================== --- gcc/tree-ssa.c (revision 249552) +++ gcc/tree-ssa.c (working copy) @@ -142,21 +142,24 @@ ssa_redirect_edge (edge e, basic_block d redirect_edge_var_map_clear (e); - /* Remove the appropriate PHI arguments in E's destination block. */ - for (gsi = gsi_start_phis (e->dest); !gsi_end_p (gsi); gsi_next (&gsi)) - { - tree def; - source_location locus ; - - phi = gsi.phi (); - def = gimple_phi_arg_def (phi, e->dest_idx); - locus = gimple_phi_arg_location (phi, e->dest_idx); + /* Remove the appropriate PHI arguments in E's destination block. + If we are redirecting a copied edge the destination has not + got PHI argument space reserved nor an interesting argument. */ + if (! (e->dest->flags & BB_DUPLICATED)) + for (gsi = gsi_start_phis (e->dest); !gsi_end_p (gsi); gsi_next (&gsi)) + { + tree def; + source_location locus ; + + phi = gsi.phi (); + def = gimple_phi_arg_def (phi, e->dest_idx); + locus = gimple_phi_arg_location (phi, e->dest_idx); - if (def == NULL_TREE) - continue; + if (def == NULL_TREE) + continue; - redirect_edge_var_map_add (e, gimple_phi_result (phi), def, locus); - } + redirect_edge_var_map_add (e, gimple_phi_result (phi), def, locus); + } e = redirect_edge_succ_nodup (e, dest);