It's amazing what 10 years away from a hunk of code will do...
So we've long had an arbitrary restriction in the jump thread
identification code that made it very conservative when searching for
threading opportunities once it encountered a backedge.
I've always known the problem was pollution of the tables with
equivalences that were no longer valid once we crossed the backedge. But I'd
never *really* looked at the problem. I'd just had a sneaking suspicion
the problems were related to the equivalences we record for the
true/false arms of conditionals, PHIs and maybe other lurkers.
So for example, if we have a PHI
x2 = PHI (x0 (bb2), x1 (bb3));
[ ... ]
x1 = <whatever>
where the incoming edge from bb3 is the backedge, we will record a
temporary equivalence x2 = x1. Once x1 is reassigned, we can no
longer use that equivalence.
So, once we have traversed a backedge, we record in one bitmap the PHI
args of any PHI nodes used to create equivalences, and in another bitmap
the LHSs of those PHI nodes.
If we see an assignment to any name recorded in the bitmap of PHI
arg names, then we iterate over the recorded LHS names to see if any are
currently equivalent to the name that just changed. If such an
equivalence is found, we eliminate it. This is probably
somewhat imprecise, but it's good enough.
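As a rough illustration of that scheme (hypothetical names here; plain C arrays stand in for GCC's sparse bitmaps and for SSA_NAME_VALUE, and SSA names are just small integers):

```c
#include <string.h>

#define MAX_NAMES 64

/* value[i] is the SSA version name i is currently equivalent to,
   or -1 when no equivalence is known.  */
static int value[MAX_NAMES];
static unsigned char src_map[MAX_NAMES]; /* PHI args used as sources.  */
static unsigned char dst_map[MAX_NAMES]; /* PHI results given equivalences.  */

static void
equiv_init (void)
{
  memset (src_map, 0, sizeof src_map);
  memset (dst_map, 0, sizeof dst_map);
  for (int i = 0; i < MAX_NAMES; i++)
    value[i] = -1;
}

/* Record DST = SRC from a PHI.  Past a backedge, also note both names
   so a later assignment to SRC can trigger invalidation.  */
static void
record_phi_equiv (int dst, int src, int backedge_seen)
{
  value[dst] = src;
  if (backedge_seen)
    {
      src_map[src] = 1;
      dst_map[dst] = 1;
    }
}

/* LHS has just been reassigned; kill any equivalence it was feeding.  */
static void
invalidate_equivalences (int lhs)
{
  if (!src_map[lhs])
    return;
  for (int d = 0; d < MAX_NAMES; d++)
    if (dst_map[d] && value[d] == lhs)
      value[d] = -1;
}
```

With the PHI above, record_phi_equiv (2, 1, 1) records x2 = x1 past a backedge; once x1 is reassigned, invalidate_equivalences (1) drops the stale x2 = x1.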
We also have problems with expressions in the hash tables. These are
pretty trivial to solve -- the hash tables are only used by the SIMPLIFY
callback. Thus, we can just use a dummy SIMPLIFY callback once we've
passed a backedge in the CFG.
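The callback swap can be sketched with a function pointer (illustrative types only; the real callbacks take gimple statements, not strings):

```c
#include <stddef.h>

typedef const char *(*simplify_fn) (const char *);

/* Stand-in for the DOM/VRP simplifier, which consults the hash
   tables.  Here it pretends every expression folds to "0".  */
static const char *
table_simplify (const char *expr)
{
  (void) expr;
  return "0";
}

/* Once a backedge has been crossed, the table contents are suspect,
   so substitute a callback that never simplifies anything.  */
static const char *
dummy_simplify (const char *expr)
{
  (void) expr;
  return NULL;
}

static const char *
try_simplify (const char *expr, int backedge_seen)
{
  simplify_fn simplify = backedge_seen ? dummy_simplify : table_simplify;
  return simplify (expr);
}
```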
Finally, for assignments that set an SSA_NAME to a value that isn't
particularly useful, we need to record *something* in the equivalence
tables (a NULL_TREE is good). That effectively invalidates any
equivalences with that name. This is useful if we had used a
conditional to derive a single point range for an SSA_NAME but the
SSA_NAME's assignment statement doesn't provide such range precision.
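The effect of recording a NULL "equivalence" can be modelled against the unwind stack (again a hypothetical model, with 0 standing in for NULL_TREE):

```c
#define MAX_NAMES 16

/* value[i] is name i's current equivalence; 0 means "unknown".  */
static int value[MAX_NAMES];

/* Unwind stack of (name, previous value) pairs.  */
static int stack_name[64], stack_prev[64];
static int sp;

/* Record NAME = VAL; VAL may be 0 to invalidate NAME, shadowing any
   earlier, now-stale equivalence until we unwind.  */
static void
record_temporary_equivalence (int name, int val)
{
  stack_name[sp] = name;
  stack_prev[sp] = value[name];
  sp++;
  value[name] = val;
}

/* Pop the stack, restoring each name's previous equivalence.  */
static void
remove_temporary_equivalences (void)
{
  while (sp > 0)
    {
      sp--;
      value[stack_name[sp]] = stack_prev[sp];
    }
}
```

For example, a conditional may have given x3 a single-point value; when x3's defining statement provides no such precision past the backedge, recording 0 makes later lookups see "unknown" rather than the stale value.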
This patch has been bootstrapped and tested in various forms over the last
couple of weeks. However, I made some simplifications late tonight, so
it needs to go through another spin.
Bootstrapped; regression testing is in progress. Will commit in the AM if
everything is clean and a final once-over doesn't cause something to
jump out as wrong.
Onwards to stage3 bugfixing :-)
Jeff
* tree-ssa-threadedge.c (record_temporary_equivalence): Handle
a NULL RHS, which we use to invalidate equivalences.
(record_temporary_equivalences_from_phis): New bitmap arguments
and a boolean indicating whether we have passed a backedge. If we
have passed a backedge, then set the appropriate bit in the
bitmaps for the SRC & DEST of PHIs creating equivalences.
(invalidate_equivalences, dummy_simplify): New functions.
(cond_arg_set_in_bb): Remove.
(record_temporary_equivalences_from_stmts_at_dest): New bitmap
arguments and a boolean indicating whether we have passed a backedge.
If we have passed a backedge, then perform invalidations as
needed.
(thread_around_empty_blocks): If we have seen a backedge, then
use the dummy simplify routine.
(thread_through_normal_block): Likewise. Pass bitmaps and
backedge status to children. Do not pessimize so much when
traversing backedges in the CFG.
(thread_across_edge): Manage the SRC_MAP/DST_MAP bitmaps.
If we have seen a backedge, then use the dummy simplify routine.
Do not pessimize so much when traversing backedges.
diff --git a/gcc/tree-ssa-threadedge.c b/gcc/tree-ssa-threadedge.c
index 7600d7b..ce161a8 100644
--- a/gcc/tree-ssa-threadedge.c
+++ b/gcc/tree-ssa-threadedge.c
@@ -170,7 +170,8 @@ record_temporary_equivalence (tree x, tree y, vec<tree> *stack)
{
tree prev_x = SSA_NAME_VALUE (x);
- if (TREE_CODE (y) == SSA_NAME)
+ /* Y may be NULL if we are invalidating entries in the table. */
+ if (y && TREE_CODE (y) == SSA_NAME)
{
tree tmp = SSA_NAME_VALUE (y);
y = tmp ? tmp : y;
@@ -186,10 +187,16 @@ record_temporary_equivalence (tree x, tree y, vec<tree> *stack)
edge E. Record unwind information for the equivalences onto STACK.
If a PHI which prevents threading is encountered, then return FALSE
- indicating we should not thread this edge, else return TRUE. */
+ indicating we should not thread this edge, else return TRUE.
+
+ If SRC_MAP/DST_MAP exist, then mark the source and destination SSA_NAMEs
+ of any equivalences recorded. We use this to make invalidation after
+ traversing back edges less painful. */
static bool
-record_temporary_equivalences_from_phis (edge e, vec<tree> *stack)
+record_temporary_equivalences_from_phis (edge e, vec<tree> *stack,
+ bool backedge_seen,
+ bitmap src_map, bitmap dst_map)
{
gimple_stmt_iterator gsi;
@@ -217,6 +224,14 @@ record_temporary_equivalences_from_phis (edge e, vec<tree> *stack)
stmt_count++;
record_temporary_equivalence (dst, src, stack);
+
+ /* If we have crossed a backedge, then start recording equivalences
+ we might need to invalidate. */
+ if (backedge_seen && TREE_CODE (src) == SSA_NAME)
+ {
+ bitmap_set_bit (src_map, SSA_NAME_VERSION (src));
+ bitmap_set_bit (dst_map, SSA_NAME_VERSION (dst));
+ }
}
return true;
}
@@ -271,6 +286,34 @@ fold_assignment_stmt (gimple stmt)
}
}
+/* A new value has been assigned to LHS. If necessary, invalidate any
+ equivalences that are no longer valid. */
+static void
+invalidate_equivalences (tree lhs, vec<tree> *stack,
+ bitmap src_map, bitmap dst_map)
+{
+ /* SRC_MAP contains the source SSA_NAMEs for equivalences created by PHI
+ nodes. If an entry in SRC_MAP changes, there's some destination that
+ has been recorded as equivalent to the source, and that equivalence
+ needs to be eliminated. */
+ if (bitmap_bit_p (src_map, SSA_NAME_VERSION (lhs)))
+ {
+ unsigned int i;
+ bitmap_iterator bi;
+
+ /* We know that the LHS of STMT was used as the RHS in an equivalency
+ created by a PHI. All the LHS of such PHIs were recorded into DST_MAP.
+ So we can iterate over them to see if any have the LHS of STMT as
+ an equivalence, and if so, remove the equivalence as it is no longer
+ valid. */
+ EXECUTE_IF_SET_IN_BITMAP (dst_map, 0, i, bi)
+ {
+ if (SSA_NAME_VALUE (ssa_name (i)) == lhs)
+ record_temporary_equivalence (ssa_name (i), NULL_TREE, stack);
+ }
+ }
+}
+
/* Try to simplify each statement in E->dest, ultimately leading to
a simplification of the COND_EXPR at the end of E->dest.
@@ -292,7 +335,10 @@ static gimple
record_temporary_equivalences_from_stmts_at_dest (edge e,
vec<tree> *stack,
tree (*simplify) (gimple,
- gimple))
+ gimple),
+ bool backedge_seen,
+ bitmap src_map,
+ bitmap dst_map)
{
gimple stmt = NULL;
gimple_stmt_iterator gsi;
@@ -369,7 +415,15 @@ record_temporary_equivalences_from_stmts_at_dest (edge e,
if (fndecl
&& (DECL_FUNCTION_CODE (fndecl) == BUILT_IN_OBJECT_SIZE
|| DECL_FUNCTION_CODE (fndecl) == BUILT_IN_CONSTANT_P))
- continue;
+ {
+ if (backedge_seen)
+ {
+ tree lhs = gimple_get_lhs (stmt);
+ record_temporary_equivalence (lhs, NULL_TREE, stack);
+ invalidate_equivalences (lhs, stack, src_map, dst_map);
+ }
+ continue;
+ }
}
/* At this point we have a statement which assigns an RHS to an
@@ -433,15 +487,36 @@ record_temporary_equivalences_from_stmts_at_dest (edge e,
}
/* Record the context sensitive equivalence if we were able
- to simplify this statement. */
+ to simplify this statement.
+
+ If we have traversed a backedge at some point during threading,
+ then always enter something here. Either a real equivalence,
+ or a NULL_TREE equivalence which is effectively invalidation of
+ prior equivalences. */
if (cached_lhs
&& (TREE_CODE (cached_lhs) == SSA_NAME
|| is_gimple_min_invariant (cached_lhs)))
record_temporary_equivalence (gimple_get_lhs (stmt), cached_lhs, stack);
+ else if (backedge_seen)
+ record_temporary_equivalence (gimple_get_lhs (stmt), NULL_TREE, stack);
+
+ if (backedge_seen)
+ invalidate_equivalences (gimple_get_lhs (stmt), stack,
+ src_map, dst_map);
}
return stmt;
}
+/* Once we have passed a backedge in the CFG when threading, we do not want to
+ utilize edge equivalences for simplification purposes. They are no longer
+ necessarily valid. We use this callback rather than the ones provided by
+ DOM/VRP to achieve that effect. */
+static tree
+dummy_simplify (gimple stmt1 ATTRIBUTE_UNUSED, gimple stmt2 ATTRIBUTE_UNUSED)
+{
+ return NULL_TREE;
+}
+
/* Simplify the control statement at the end of the block E->dest.
To avoid allocating memory unnecessarily, a scratch GIMPLE_COND
@@ -581,44 +656,6 @@ simplify_control_stmt_condition (edge e,
return cached_lhs;
}
-/* Return TRUE if the statement at the end of e->dest depends on
- the output of any statement in BB. Otherwise return FALSE.
-
- This is used when we are threading a backedge and need to ensure
- that temporary equivalences from BB do not affect the condition
- in e->dest. */
-
-static bool
-cond_arg_set_in_bb (edge e, basic_block bb)
-{
- ssa_op_iter iter;
- use_operand_p use_p;
- gimple last = last_stmt (e->dest);
-
- /* E->dest does not have to end with a control transferring
- instruction. This can occur when we try to extend a jump
- threading opportunity deeper into the CFG. In that case
- it is safe for this check to return false. */
- if (!last)
- return false;
-
- if (gimple_code (last) != GIMPLE_COND
- && gimple_code (last) != GIMPLE_GOTO
- && gimple_code (last) != GIMPLE_SWITCH)
- return false;
-
- FOR_EACH_SSA_USE_OPERAND (use_p, last, iter, SSA_OP_USE | SSA_OP_VUSE)
- {
- tree use = USE_FROM_PTR (use_p);
-
- if (TREE_CODE (use) == SSA_NAME
- && gimple_code (SSA_NAME_DEF_STMT (use)) != GIMPLE_PHI
- && gimple_bb (SSA_NAME_DEF_STMT (use)) == bb)
- return true;
- }
- return false;
-}
-
/* Copy debug stmts from DEST's chain of single predecessors up to
SRC, so that we don't lose the bindings as PHI nodes are introduced
when DEST gains new predecessors. */
@@ -805,6 +842,8 @@ thread_around_empty_blocks (edge taken_edge,
path->safe_push (x);
bitmap_set_bit (visited, taken_edge->dest->index);
*backedge_seen_p |= ((taken_edge->flags & EDGE_DFS_BACK) != 0);
+ if (*backedge_seen_p)
+ simplify = dummy_simplify;
return thread_around_empty_blocks (taken_edge,
dummy_cond,
handle_dominating_asserts,
@@ -827,6 +866,13 @@ thread_around_empty_blocks (edge taken_edge,
&& gimple_code (stmt) != GIMPLE_SWITCH)
return false;
+ /* If we have traversed a backedge, then we do not want to look
+ at certain expressions in the table that cannot be relied upon.
+ Luckily the only code that looked at those expressions is the
+ SIMPLIFY callback, which we replace if we can no longer use it. */
+ if (*backedge_seen_p)
+ simplify = dummy_simplify;
+
/* Extract and simplify the condition. */
cond = simplify_control_stmt_condition (taken_edge, stmt, dummy_cond,
simplify, handle_dominating_asserts);
@@ -846,6 +892,8 @@ thread_around_empty_blocks (edge taken_edge,
= new jump_thread_edge (taken_edge, EDGE_NO_COPY_SRC_BLOCK);
path->safe_push (x);
*backedge_seen_p |= ((taken_edge->flags & EDGE_DFS_BACK) != 0);
+ if (*backedge_seen_p)
+ simplify = dummy_simplify;
thread_around_empty_blocks (taken_edge,
dummy_cond,
@@ -896,24 +944,28 @@ thread_through_normal_block (edge e,
tree (*simplify) (gimple, gimple),
vec<jump_thread_edge *> *path,
bitmap visited,
- bool *backedge_seen_p)
+ bool *backedge_seen_p,
+ bitmap src_map,
+ bitmap dst_map)
{
- /* If we have crossed a backedge, then we want to verify that the COND_EXPR,
- SWITCH_EXPR or GOTO_EXPR at the end of e->dest is not affected
- by any statements in e->dest. If it is affected, then it is not
- safe to thread this edge. */
- if (*backedge_seen_p
- && cond_arg_set_in_bb (e, e->dest))
- return false;
+ /* If we have traversed a backedge, then we do not want to look
+ at certain expressions in the table that cannot be relied upon.
+ Luckily the only code that looked at those expressions is the
+ SIMPLIFY callback, which we replace if we can no longer use it. */
+ if (*backedge_seen_p)
+ simplify = dummy_simplify;
/* PHIs create temporary equivalences. */
- if (!record_temporary_equivalences_from_phis (e, stack))
+ if (!record_temporary_equivalences_from_phis (e, stack, *backedge_seen_p,
+ src_map, dst_map))
return false;
/* Now walk each statement recording any context sensitive
temporary equivalences we can detect. */
gimple stmt
- = record_temporary_equivalences_from_stmts_at_dest (e, stack, simplify);
+ = record_temporary_equivalences_from_stmts_at_dest (e, stack, simplify,
+ *backedge_seen_p,
+ src_map, dst_map);
if (!stmt)
return false;
@@ -955,25 +1007,24 @@ thread_through_normal_block (edge e,
= new jump_thread_edge (taken_edge, EDGE_COPY_SRC_BLOCK);
path->safe_push (x);
*backedge_seen_p |= ((taken_edge->flags & EDGE_DFS_BACK) != 0);
+ if (*backedge_seen_p)
+ simplify = dummy_simplify;
/* See if we can thread through DEST as well, this helps capture
secondary effects of threading without having to re-run DOM or
- VRP. */
- if (!*backedge_seen_p
- || ! cond_arg_set_in_bb (taken_edge, e->dest))
- {
- /* We don't want to thread back to a block we have already
- visited. This may be overly conservative. */
- bitmap_set_bit (visited, dest->index);
- bitmap_set_bit (visited, e->dest->index);
- thread_around_empty_blocks (taken_edge,
- dummy_cond,
- handle_dominating_asserts,
- simplify,
- visited,
- path,
- backedge_seen_p);
- }
+ VRP.
+
+ We don't want to thread back to a block we have already
+ visited. This may be overly conservative. */
+ bitmap_set_bit (visited, dest->index);
+ bitmap_set_bit (visited, e->dest->index);
+ thread_around_empty_blocks (taken_edge,
+ dummy_cond,
+ handle_dominating_asserts,
+ simplify,
+ visited,
+ path,
+ backedge_seen_p);
return true;
}
}
@@ -1015,6 +1066,8 @@ thread_across_edge (gimple dummy_cond,
tree (*simplify) (gimple, gimple))
{
bitmap visited = BITMAP_ALLOC (NULL);
+ bitmap src_map = BITMAP_ALLOC (NULL);
+ bitmap dst_map = BITMAP_ALLOC (NULL);
bool backedge_seen;
stmt_count = 0;
@@ -1024,14 +1077,19 @@ thread_across_edge (gimple dummy_cond,
bitmap_set_bit (visited, e->src->index);
bitmap_set_bit (visited, e->dest->index);
backedge_seen = ((e->flags & EDGE_DFS_BACK) != 0);
+ if (backedge_seen)
+ simplify = dummy_simplify;
+
if (thread_through_normal_block (e, dummy_cond, handle_dominating_asserts,
stack, simplify, path, visited,
- &backedge_seen))
+ &backedge_seen, src_map, dst_map))
{
propagate_threaded_block_debug_into (path->last ()->e->dest,
e->dest);
remove_temporary_equivalences (stack);
BITMAP_FREE (visited);
+ BITMAP_FREE (src_map);
+ BITMAP_FREE (dst_map);
register_jump_thread (path);
return;
}
@@ -1066,15 +1124,26 @@ thread_across_edge (gimple dummy_cond,
{
remove_temporary_equivalences (stack);
BITMAP_FREE (visited);
+ BITMAP_FREE (src_map);
+ BITMAP_FREE (dst_map);
return;
}
+ /* We need to restore the state of the maps to this point each loop
+ iteration. */
+ bitmap src_map_copy = BITMAP_ALLOC (NULL);
+ bitmap dst_map_copy = BITMAP_ALLOC (NULL);
+ bitmap_copy (src_map_copy, src_map);
+ bitmap_copy (dst_map_copy, dst_map);
+
/* Look at each successor of E->dest to see if we can thread through it.  */
FOR_EACH_EDGE (taken_edge, ei, e->dest->succs)
{
/* Push a fresh marker so we can unwind the equivalences created
for each of E->dest's successors. */
stack->safe_push (NULL_TREE);
+ bitmap_copy (src_map, src_map_copy);
+ bitmap_copy (dst_map, dst_map_copy);
/* Avoid threading to any block we have already visited. */
bitmap_clear (visited);
@@ -1093,23 +1162,25 @@ thread_across_edge (gimple dummy_cond,
found = false;
backedge_seen = ((e->flags & EDGE_DFS_BACK) != 0);
backedge_seen |= ((taken_edge->flags & EDGE_DFS_BACK) != 0);
- if (!backedge_seen
- || ! cond_arg_set_in_bb (path->last ()->e, e->dest))
- found = thread_around_empty_blocks (taken_edge,
- dummy_cond,
- handle_dominating_asserts,
- simplify,
- visited,
- path,
- &backedge_seen);
-
- if (!found
- && (!backedge_seen
- || ! cond_arg_set_in_bb (path->last ()->e, e->dest)))
+ if (backedge_seen)
+ simplify = dummy_simplify;
+ found = thread_around_empty_blocks (taken_edge,
+ dummy_cond,
+ handle_dominating_asserts,
+ simplify,
+ visited,
+ path,
+ &backedge_seen);
+
+ if (backedge_seen)
+ simplify = dummy_simplify;
+
+ if (!found)
found = thread_through_normal_block (path->last ()->e, dummy_cond,
handle_dominating_asserts,
stack, simplify, path, visited,
- &backedge_seen);
+ &backedge_seen,
+ src_map, dst_map);
/* If we were able to thread through a successor of E->dest, then
record the jump threading opportunity. */
@@ -1128,6 +1199,10 @@ thread_across_edge (gimple dummy_cond,
remove_temporary_equivalences (stack);
}
BITMAP_FREE (visited);
+ BITMAP_FREE (src_map);
+ BITMAP_FREE (dst_map);
+ BITMAP_FREE (src_map_copy);
+ BITMAP_FREE (dst_map_copy);
}
remove_temporary_equivalences (stack);