When I wrote the improved support for threading across backedges I tried to minimize the cost to invalidate equivalences. This led to some convoluted code to track things which might need invalidation (at PHI nodes), then further hacks to invalidate equivalences implied by traversal of particular edges, etc.

This bug is another example of an edge equivalence that needs to be invalidated. And like other edge equivalences that need invalidation, there's no chance to track it as the equivalency is created outside the threading code.

Rather than layer another hack on the existing hacks, I just ripped out the hacks and did the more expensive equivalency invalidation. In reality we're only invalidating after following a backedge in the CFG and only if we encounter statements that doesn't produce something useful. So it shouldn't be all that expensive.

Note the tests may look the same, but they're subtly different in that they have different orderings of arguments in an equality test, which is important for this testcase.

Bootstrapped and regression tested on x86_64-unknown-linux-gnu and applied to the trunk. Will backport to 4.9 after it bakes a bit on the trunk.


Jeff
commit 1afaa5b951f3a5dae5d6c2355c1457c7d175e1c9
Author: Jeff Law <l...@redhat.com>
Date:   Thu Jun 5 12:20:42 2014 -0600

        PR tree-optimization/61289
        * tree-ssa-threadedge.c (invalidate_equivalences): Remove SRC_MAP and
        DST_MAP parameters.   Invalidate by walking all the SSA_NAME_VALUES
        looking for those which match LHS.  All callers changed.
        (record_temporary_equivalences_from_phis): Remove SRC_MAP and DST_MAP
        parameters and code which manipulated them.  All callers changed.
        (record_temporary_equivalences_from_stmts_at_dest): Remove SRC_MAP
        and DST_MAP parameters.  Simplify invalidation code by just calling
        invalidate_equivalences.  All callers changed.
        (thread_across_edge): Simplify now that we don't need to maintain
        the map of equivalences to invalidate.
    
            PR tree-optimization/61289
        * g++.dg/pr61289.C: New test.
        * g++.dg/pr61289-2.C: New test.

diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index 4d88dd2..94a30d4 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,17 @@
+2014-06-05  Jeff Law  <l...@redhat.com>
+
+       PR tree-optimization/61289
+       * tree-ssa-threadedge.c (invalidate_equivalences): Remove SRC_MAP and
+       DST_MAP parameters.   Invalidate by walking all the SSA_NAME_VALUES
+       looking for those which match LHS.  All callers changed.
+       (record_temporary_equivalences_from_phis): Remove SRC_MAP and DST_MAP
+       parameters and code which manipulated them.  All callers changed.
+       (record_temporary_equivalences_from_stmts_at_dest): Remove SRC_MAP
+       and DST_MAP parameters.  Simplify invalidation code by just calling
+       invalidate_equivalences.  All callers changed.
+       (thread_across_edge): Simplify now that we don't need to maintain
+       the map of equivalences to invalidate.
+
 2014-06-05  Kai Tietz  <kti...@redhat.com>
            Richard Henderson  <r...@redhat.com>
 
diff --git a/gcc/testsuite/ChangeLog b/gcc/testsuite/ChangeLog
index 54a4026..5fb5103 100644
--- a/gcc/testsuite/ChangeLog
+++ b/gcc/testsuite/ChangeLog
@@ -1,3 +1,9 @@
+2014-06-05  Jeff Law  <l...@redhat.com>
+
+       PR tree-optimization/61289
+       * g++.dg/pr61289.C: New test.
+       * g++.dg/pr61289-2.C: New test.
+
 2014-06-05  Richard Biener  <rguent...@suse.de>
            Paolo Carlini  <paolo.carl...@oracle.com>
 
diff --git a/gcc/testsuite/g++.dg/pr61289-2.c b/gcc/testsuite/g++.dg/pr61289-2.c
new file mode 100644
index 0000000..4cc3ebe
--- /dev/null
+++ b/gcc/testsuite/g++.dg/pr61289-2.c
@@ -0,0 +1,62 @@
+/* { dg-do run } */
+/* { dg-options "-O2 -fno-exceptions" } */
+struct S
+{
+  inline int fn1 () const { return s; }
+  __attribute__ ((noinline, noclone)) S *fn2 (int);
+  __attribute__ ((noinline, noclone)) void fn3 ();
+  __attribute__ ((noinline, noclone)) static S *fn4 (int);
+  S (int i) : s (i) {}
+  int s;
+};
+
+int a = 0;
+S *b = 0;
+
+S *
+S::fn2 (int i)
+{
+  a++;
+  if (a == 1)
+    return b;
+  if (a > 3)
+    __builtin_abort ();
+  b = this;
+  return new S (i + s);
+}
+
+S *
+S::fn4 (int i)
+{
+  b = new S (i);
+  return b;
+}
+
+void
+S::fn3 ()
+{
+  delete this;
+}
+
+void
+foo ()
+{
+  S *c = S::fn4 (20);
+  for (int i = 0; i < 2;)
+    {
+      S *d = c->fn2 (c->fn1 () + 10);
+      if (c != d)
+{
+  c->fn3 ();
+  c = d;
+  ++i;
+}
+    }
+  c->fn3 ();
+}
+
+int
+main ()
+{
+  foo ();
+}
diff --git a/gcc/testsuite/g++.dg/pr61289.C b/gcc/testsuite/g++.dg/pr61289.C
new file mode 100644
index 0000000..ea7ccea
--- /dev/null
+++ b/gcc/testsuite/g++.dg/pr61289.C
@@ -0,0 +1,63 @@
+/* { dg-do run } */
+/* { dg-options "-O2 -fno-exceptions" } */
+
+struct S
+{
+  inline int fn1 () const { return s; }
+  __attribute__ ((noinline, noclone)) S *fn2 (int);
+  __attribute__ ((noinline, noclone)) void fn3 ();
+  __attribute__ ((noinline, noclone)) static S *fn4 (int);
+  S (int i) : s (i) {}
+  int s;
+};
+
+int a = 0;
+S *b = 0;
+
+S *
+S::fn2 (int i)
+{
+  a++;
+  if (a == 1)
+    return b;
+  if (a > 3)
+    __builtin_abort ();
+  b = this;
+  return new S (i + s);
+}
+
+S *
+S::fn4 (int i)
+{
+  b = new S (i);
+  return b;
+}
+
+void
+S::fn3 ()
+{
+  delete this;
+}
+
+void
+foo ()
+{
+  S *c = S::fn4 (20);
+  for (int i = 0; i < 2;)
+    {
+      S *d = c->fn2 (c->fn1 () + 10);
+      if (d != c)
+{
+  c->fn3 ();
+  c = d;
+  ++i;
+}
+    }
+  c->fn3 ();
+}
+
+int
+main ()
+{
+  foo ();
+}
diff --git a/gcc/tree-ssa-threadedge.c b/gcc/tree-ssa-threadedge.c
index 69e5a6b..ba9e1fe 100644
--- a/gcc/tree-ssa-threadedge.c
+++ b/gcc/tree-ssa-threadedge.c
@@ -200,9 +200,7 @@ record_temporary_equivalence (tree x, tree y, vec<tree> 
*stack)
    traversing back edges less painful.  */
 
 static bool
-record_temporary_equivalences_from_phis (edge e, vec<tree> *stack,
-                                        bool backedge_seen,
-                                        bitmap src_map, bitmap dst_map)
+record_temporary_equivalences_from_phis (edge e, vec<tree> *stack)
 {
   gimple_stmt_iterator gsi;
 
@@ -230,14 +228,6 @@ record_temporary_equivalences_from_phis (edge e, vec<tree> 
*stack,
        stmt_count++;
 
       record_temporary_equivalence (dst, src, stack);
-
-      /* If we have crossed a backedge, then start recording equivalences
-        we might need to invalidate.  */
-      if (backedge_seen && TREE_CODE (src) == SSA_NAME)
-       {
-         bitmap_set_bit (src_map, SSA_NAME_VERSION (src));
-         bitmap_set_bit (dst_map, SSA_NAME_VERSION (dst));
-       }
     }
   return true;
 }
@@ -295,29 +285,15 @@ fold_assignment_stmt (gimple stmt)
 /* A new value has been assigned to LHS.  If necessary, invalidate any
    equivalences that are no longer valid.  */
 static void
-invalidate_equivalences (tree lhs, vec<tree> *stack,
-                        bitmap src_map, bitmap dst_map)
+invalidate_equivalences (tree lhs, vec<tree> *stack)
 {
-  /* SRC_MAP contains the source SSA_NAMEs for equivalences created by PHI
-     nodes.  If an entry in SRC_MAP changes, there's some destination that
-     has been recorded as equivalent to the source and that equivalency
-     needs to be eliminated.  */
-  if (bitmap_bit_p (src_map, SSA_NAME_VERSION (lhs)))
-    {
-      unsigned int i;
-      bitmap_iterator bi;
-
-      /* We know that the LHS of STMT was used as the RHS in an equivalency
-        created by a PHI.  All the LHS of such PHIs were recorded into DST_MAP.
-        So we can iterate over them to see if any have the LHS of STMT as
-        an equivalence, and if so, remove the equivalence as it is no longer
-        valid.  */
-      EXECUTE_IF_SET_IN_BITMAP (dst_map, 0, i, bi)
-       {
-         if (SSA_NAME_VALUE (ssa_name (i)) == lhs)
-           record_temporary_equivalence (ssa_name (i), NULL_TREE, stack);
-       }
-    }
+
+  for (unsigned int i = 1; i < num_ssa_names; i++)
+    if (ssa_name (i) && SSA_NAME_VALUE (ssa_name (i)) == lhs)
+      record_temporary_equivalence (ssa_name (i), NULL_TREE, stack);
+
+  if (SSA_NAME_VALUE (lhs))
+    record_temporary_equivalence (lhs, NULL_TREE, stack);
 }
 
 /* Try to simplify each statement in E->dest, ultimately leading to
@@ -342,9 +318,7 @@ record_temporary_equivalences_from_stmts_at_dest (edge e,
                                                  vec<tree> *stack,
                                                  tree (*simplify) (gimple,
                                                                    gimple),
-                                                 bool backedge_seen,
-                                                 bitmap src_map,
-                                                 bitmap dst_map)
+                                                 bool backedge_seen)
 {
   gimple stmt = NULL;
   gimple_stmt_iterator gsi;
@@ -400,19 +374,7 @@ record_temporary_equivalences_from_stmts_at_dest (edge e,
 
          if (backedge_seen)
            FOR_EACH_SSA_TREE_OPERAND (op, stmt, iter, SSA_OP_DEF)
-             {
-               /* This call only invalidates equivalences created by
-                  PHI nodes.  This is by design to keep the cost of
-                  of invalidation reasonable.  */
-               invalidate_equivalences (op, stack, src_map, dst_map);
-
-               /* However, conditionals can imply values for real
-                  operands as well.  And those won't be recorded in the
-                  maps.  In fact, those equivalences may be recorded totally
-                  outside the threading code.  We can just create a new
-                  temporary NULL equivalence here.  */
-               record_temporary_equivalence (op, NULL_TREE, stack);
-             }
+             invalidate_equivalences (op, stack);
 
          continue;
        }
@@ -452,8 +414,7 @@ record_temporary_equivalences_from_stmts_at_dest (edge e,
              if (backedge_seen)
                {
                  tree lhs = gimple_get_lhs (stmt);
-                 record_temporary_equivalence (lhs, NULL_TREE, stack);
-                 invalidate_equivalences (lhs, stack, src_map, dst_map);
+                 invalidate_equivalences (lhs, stack);
                }
              continue;
            }
@@ -531,11 +492,7 @@ record_temporary_equivalences_from_stmts_at_dest (edge e,
              || is_gimple_min_invariant (cached_lhs)))
        record_temporary_equivalence (gimple_get_lhs (stmt), cached_lhs, stack);
       else if (backedge_seen)
-       record_temporary_equivalence (gimple_get_lhs (stmt), NULL_TREE, stack);
-
-      if (backedge_seen)
-       invalidate_equivalences (gimple_get_lhs (stmt), stack,
-                                src_map, dst_map);
+       invalidate_equivalences (gimple_get_lhs (stmt), stack);
     }
   return stmt;
 }
@@ -982,9 +939,7 @@ thread_through_normal_block (edge e,
                             tree (*simplify) (gimple, gimple),
                             vec<jump_thread_edge *> *path,
                             bitmap visited,
-                            bool *backedge_seen_p,
-                            bitmap src_map,
-                            bitmap dst_map)
+                            bool *backedge_seen_p)
 {
   /* If we have traversed a backedge, then we do not want to look
      at certain expressions in the table that can not be relied upon.
@@ -994,16 +949,14 @@ thread_through_normal_block (edge e,
     simplify = dummy_simplify;
 
   /* PHIs create temporary equivalences.  */
-  if (!record_temporary_equivalences_from_phis (e, stack, *backedge_seen_p,
-                                               src_map, dst_map))
+  if (!record_temporary_equivalences_from_phis (e, stack))
     return 0;
 
   /* Now walk each statement recording any context sensitive
      temporary equivalences we can detect.  */
   gimple stmt
     = record_temporary_equivalences_from_stmts_at_dest (e, stack, simplify,
-                                                       *backedge_seen_p,
-                                                       src_map, dst_map);
+                                                       *backedge_seen_p);
 
   /* If we didn't look at all the statements, the most likely reason is
      there were too many and thus duplicating this block is not profitable.
@@ -1112,8 +1065,6 @@ thread_across_edge (gimple dummy_cond,
                    tree (*simplify) (gimple, gimple))
 {
   bitmap visited = BITMAP_ALLOC (NULL);
-  bitmap src_map = BITMAP_ALLOC (NULL);
-  bitmap dst_map = BITMAP_ALLOC (NULL);
   bool backedge_seen;
 
   stmt_count = 0;
@@ -1129,16 +1080,13 @@ thread_across_edge (gimple dummy_cond,
   int threaded = thread_through_normal_block (e, dummy_cond,
                                              handle_dominating_asserts,
                                              stack, simplify, path,
-                                             visited, &backedge_seen,
-                                             src_map, dst_map);
+                                             visited, &backedge_seen);
   if (threaded > 0)
     {
       propagate_threaded_block_debug_into (path->last ()->e->dest,
                                           e->dest);
       remove_temporary_equivalences (stack);
       BITMAP_FREE (visited);
-      BITMAP_FREE (src_map);
-      BITMAP_FREE (dst_map);
       register_jump_thread (path);
       return;
     }
@@ -1160,8 +1108,6 @@ thread_across_edge (gimple dummy_cond,
       if (threaded < 0)
        {
          BITMAP_FREE (visited);
-         BITMAP_FREE (src_map);
-         BITMAP_FREE (dst_map);
          remove_temporary_equivalences (stack);
          return;
        }
@@ -1190,26 +1136,15 @@ thread_across_edge (gimple dummy_cond,
        {
          remove_temporary_equivalences (stack);
          BITMAP_FREE (visited);
-         BITMAP_FREE (src_map);
-         BITMAP_FREE (dst_map);
          return;
        }
 
-    /* We need to restore the state of the maps to this point each loop
-       iteration.  */
-    bitmap src_map_copy = BITMAP_ALLOC (NULL);
-    bitmap dst_map_copy = BITMAP_ALLOC (NULL);
-    bitmap_copy (src_map_copy, src_map);
-    bitmap_copy (dst_map_copy, dst_map);
-
     /* Look at each successor of E->dest to see if we can thread through it.  
*/
     FOR_EACH_EDGE (taken_edge, ei, e->dest->succs)
       {
        /* Push a fresh marker so we can unwind the equivalences created
           for each of E->dest's successors.  */
        stack->safe_push (NULL_TREE);
-       bitmap_copy (src_map, src_map_copy);
-       bitmap_copy (dst_map, dst_map_copy);
      
        /* Avoid threading to any block we have already visited.  */
        bitmap_clear (visited);
@@ -1245,8 +1180,7 @@ thread_across_edge (gimple dummy_cond,
          found = thread_through_normal_block (path->last ()->e, dummy_cond,
                                               handle_dominating_asserts,
                                               stack, simplify, path, visited,
-                                              &backedge_seen,
-                                              src_map, dst_map) > 0;
+                                              &backedge_seen) > 0;
 
        /* If we were able to thread through a successor of E->dest, then
           record the jump threading opportunity.  */
@@ -1265,10 +1199,6 @@ thread_across_edge (gimple dummy_cond,
        remove_temporary_equivalences (stack);
       }
     BITMAP_FREE (visited);
-    BITMAP_FREE (src_map);
-    BITMAP_FREE (dst_map);
-    BITMAP_FREE (src_map_copy);
-    BITMAP_FREE (dst_map_copy);
   }
 
   remove_temporary_equivalences (stack);

Reply via email to