SLP induction vectorization runs into the issue that it remembers
pointers to PHI nodes in the SLP tree during analysis.  But those
may get invalidated by loop copying (for prologue/epilogue peeling
or versioning) as the low-level CFG helper copy_bbs works in the
way of copying individual BBs plus their outgoing edges but with
old destinations and at the end re-directing the edges to the
desired location.  In SSA this triggers the whole machinery of
making room for new PHI nodes -- that is undesirable because it
causes re-allocation of PHI nodes in the set of source blocks.

After much pondering I arrived at the following (least ugly) solution
to this "problem" (well, I define it as a problem, it's at least
an inefficiency and a workaround in the vectorizer would be way
uglier).  Namely simply do not trigger the SSA machinery for
blocks with BB_DUPLICATED (I skimmed all other users and they seem
fine).

In the process I also implemented some poisoning of the old PHI node
when we reallocate (well, free) PHI nodes.  But that triggers some
other issues, one fixed by the tree-ssa-phionlycoprop.c hunk below.
So I'm not submitting it as part of this fix.

Bootstrapped (with the poisoning sofar, plain patch still running)
on x86_64-unknown-linux-gnu, testing in progress.

Comments welcome, testing won't finish before I leave for the
weekend.

Thanks,
Richard.

2017-06-23  Richard Biener  <rguent...@suse.de>

        * cfghooks.c (duplicate_block): Do not copy BB_DUPLICATED flag.
        (copy_bbs): Set BB_DUPLICATED flag early.
        (execute_on_growing_pred): Do not execute for BB_DUPLICATED
        marked blocks.
        (execute_on_shrinking_pred): Likewise.
        * tree-ssa.c (ssa_redirect_edge): Do not look for PHI args in
        BB_DUPLICATED blocks.
        * tree-ssa-phionlycoprop.c (eliminate_degenerate_phis_1): Properly
        iterate over all PHIs considering removal of *gsi.

Index: gcc/cfghooks.c
===================================================================
--- gcc/cfghooks.c      (revision 249552)
+++ gcc/cfghooks.c      (working copy)
@@ -1087,7 +1087,7 @@ duplicate_block (basic_block bb, edge e,
   if (after)
     move_block_after (new_bb, after);
 
-  new_bb->flags = bb->flags;
+  new_bb->flags = (bb->flags & ~BB_DUPLICATED);
   FOR_EACH_EDGE (s, ei, bb->succs)
     {
       /* Since we are creating edges from a new block to successors
@@ -1207,7 +1207,8 @@ flow_call_edges_add (sbitmap blocks)
 void
 execute_on_growing_pred (edge e)
 {
-  if (cfg_hooks->execute_on_growing_pred)
+  if (! (e->dest->flags & BB_DUPLICATED)
+      && cfg_hooks->execute_on_growing_pred)
     cfg_hooks->execute_on_growing_pred (e);
 }
 
@@ -1217,7 +1218,8 @@ execute_on_growing_pred (edge e)
 void
 execute_on_shrinking_pred (edge e)
 {
-  if (cfg_hooks->execute_on_shrinking_pred)
+  if (! (e->dest->flags & BB_DUPLICATED)
+      && cfg_hooks->execute_on_shrinking_pred)
     cfg_hooks->execute_on_shrinking_pred (e);
 }
 
@@ -1353,6 +1355,12 @@ copy_bbs (basic_block *bbs, unsigned n,
   basic_block bb, new_bb, dom_bb;
   edge e;
 
+  /* Mark the blocks to be copied.  This is used by edge creation hooks
+     to decide whether to reallocate PHI nodes capacity to avoid reallocating
+     PHIs in the set of source BBs.  */
+  for (i = 0; i < n; i++)
+    bbs[i]->flags |= BB_DUPLICATED;
+
   /* Duplicate bbs, update dominators, assign bbs to loops.  */
   for (i = 0; i < n; i++)
     {
@@ -1360,7 +1368,6 @@ copy_bbs (basic_block *bbs, unsigned n,
       bb = bbs[i];
       new_bb = new_bbs[i] = duplicate_block (bb, NULL, after);
       after = new_bb;
-      bb->flags |= BB_DUPLICATED;
       if (bb->loop_father)
        {
          /* Possibly set loop header.  */
Index: gcc/tree-ssa-phionlycprop.c
===================================================================
--- gcc/tree-ssa-phionlycprop.c (revision 249552)
+++ gcc/tree-ssa-phionlycprop.c (working copy)
@@ -420,10 +420,11 @@ eliminate_degenerate_phis_1 (basic_block
   basic_block son;
   bool cfg_altered = false;
 
-  for (gsi = gsi_start_phis (bb); !gsi_end_p (gsi); gsi_next (&gsi))
+  for (gsi = gsi_start_phis (bb); !gsi_end_p (gsi);)
     {
       gphi *phi = gsi.phi ();
-
+      /* We might end up removing PHI so advance the iterator now.  */
+      gsi_next (&gsi);
       cfg_altered |= eliminate_const_or_copy (phi, interesting_names,
                                              need_eh_cleanup);
     }

Index: gcc/tree-ssa.c
===================================================================
--- gcc/tree-ssa.c      (revision 249552)
+++ gcc/tree-ssa.c      (working copy)
@@ -142,21 +142,24 @@ ssa_redirect_edge (edge e, basic_block d
 
   redirect_edge_var_map_clear (e);
 
-  /* Remove the appropriate PHI arguments in E's destination block.  */
-  for (gsi = gsi_start_phis (e->dest); !gsi_end_p (gsi); gsi_next (&gsi))
-    {
-      tree def;
-      source_location locus ;
-
-      phi = gsi.phi ();
-      def = gimple_phi_arg_def (phi, e->dest_idx);
-      locus = gimple_phi_arg_location (phi, e->dest_idx);
+  /* Remove the appropriate PHI arguments in E's destination block.
+     If we are redirecting a copied edge the destination has not
+     got PHI argument space reserved nor an interesting argument.  */
+  if (! (e->dest->flags & BB_DUPLICATED))
+    for (gsi = gsi_start_phis (e->dest); !gsi_end_p (gsi); gsi_next (&gsi))
+      {
+       tree def;
+       source_location locus ;
+
+       phi = gsi.phi ();
+       def = gimple_phi_arg_def (phi, e->dest_idx);
+       locus = gimple_phi_arg_location (phi, e->dest_idx);
 
-      if (def == NULL_TREE)
-       continue;
+       if (def == NULL_TREE)
+         continue;
 
-      redirect_edge_var_map_add (e, gimple_phi_result (phi), def, locus);
-    }
+       redirect_edge_var_map_add (e, gimple_phi_result (phi), def, locus);
+      }
 
   e = redirect_edge_succ_nodup (e, dest);
 

Reply via email to