On Tue, Apr 19, 2016 at 7:50 PM, Patrick Palka <patr...@parcs.ath.cx> wrote:
> This patch makes the jump threader look through the BIT_AND_EXPRs and
> BIT_IOR_EXPRs within a condition so that we could find dominating
> ASSERT_EXPRs that could help make the overall condition evaluate to a
> constant.  For example, we currently don't perform any jump threading
> in the following test case, even though it's known that if the code
> calls foo() then it can't possibly call bar() afterwards:
>
>   void
>   baz_1 (int a, int b, int c)
>   {
>     if (a && b)
>       foo ();
>     if (!b && c)
>       bar ();
>   }
>
>   <bb 2>:
>   _4 = a_3(D) != 0;
>   _6 = b_5(D) != 0;
>   _7 = _4 & _6;
>   if (_7 != 0)
>     goto <bb 3>;
>   else
>     goto <bb 4>;
>
>   <bb 3>:
>   b_15 = ASSERT_EXPR <b_5(D), b_5(D) != 0>;
>   foo ();
>
>   <bb 4>:
>   _10 = b_5(D) == 0;
>   _12 = c_11(D) != 0;
>   _13 = _10 & _12;
>   if (_13 != 0)
>     goto <bb 5>;
>   else
>     goto <bb 6>;
>
>   <bb 5>:
>   bar ();
>
>   <bb 6>:
>   return;
>
> So here we miss a jump threading opportunity that would have made bb 3
> jump straight to bb 6 instead of falling through to bb 4.
>
> If we inspect the operands of the BIT_AND_EXPR of _13 we'll notice that
> there is an ASSERT_EXPR that says its left operand b_5 is non-zero.  We
> could use this ASSERT_EXPR to deduce that the condition (_13 != 0) is
> always false.  This is what this patch does, basically by making
> simplify_control_stmt_condition recurse into BIT_AND_EXPRs and
> BIT_IOR_EXPRs.
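In bb 3's terms: b_15 = ASSERT_EXPR <b_5(D), b_5(D) != 0> guarantees that
_10 = b_5(D) == 0 is false on that path, hence _13 = _10 & _12 is zero and
the second test is dead there.  A rough sketch of the threaded result being
asked for (exact block numbering and edge representation may differ):

  <bb 3>:
  b_15 = ASSERT_EXPR <b_5(D), b_5(D) != 0>;
  foo ();
  goto <bb 6>;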
> Does this seem like a good idea/approach?
>
> Notes:
>
>   1. This patch introduces a "regression" in gcc.dg/tree-ssa/ssa-thread-11.c
>      in that we no longer perform FSM threading during vrp2 but instead we
>      detect two new jump threading opportunities during vrp1.  Not sure if
>      the new code is better, but it is shorter.  I wonder how this should
>      be resolved...

Try adjusting the testcase so that it performs the FSM threading again,
or adjust the expected outcome...

>   2. I haven't tested the performance impact of this patch.  What would
>      be a good way to do this?
>
>   3. According to my instrumentation, an older version of this change
>      added 4000 new threaded jumps during bootstrap.

Looks reasonable to me.

I wonder if we want to limit the amount of recursion done - we might get
exponential explosion for, say

  _1 = a_1 < 0;
  _2 = b_3 < 0;
  _4 = c_5 < 0;
  _6 = _1 & _2;
  _7 = _6 & _4;
  _8 = _1 & _4;
  _9 = _7 & _6;
  ...

that is, if the conditions don't form a tree but a graph, and thus we
visit some conditions multiple times (unnecessarily).  reassoc might
simplify this, but the first DOM/VRP runs before that.
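To put a rough number on that worry, here is a stand-alone toy model of
the recursion (not GCC code; node K stands for SSA name _K and lhs/rhs
for its two operands).  With maximal sharing, a walk that descends into
both operands without remembering what it already proved does 2^N - 1
visits for N statements:

  #include <stdio.h>

  #define N 30

  /* lhs[K]/rhs[K] are the operand indices of node K, -1 for a leaf.  */
  static int lhs[N], rhs[N];
  static unsigned long visits;

  /* Recurse into both operands with no memoization, the way the patch
     recurses into rhs1 and rhs2, so a shared operand is visited once
     per path that reaches it.  */
  static void
  visit (int n)
  {
    visits++;
    if (lhs[n] < 0)
      return;
    visit (lhs[n]);
    visit (rhs[n]);
  }

  int
  main (void)
  {
    /* Worst-case sharing: _K = _{K-1} & _{K-1}.  */
    lhs[0] = rhs[0] = -1;
    for (int k = 1; k < N; k++)
      lhs[k] = rhs[k] = k - 1;

    visit (N - 1);
    /* Prints 1073741823 (= 2^30 - 1) visits for just 30 statements.  */
    printf ("%lu visits for %d nodes\n", visits, N);
    return 0;
  }

A depth cap like the recursion_limit in the patch below bounds this to on
the order of 2^limit leaf queries per condition, at the cost of missing
chains deeper than the limit; a visited set would avoid the re-visits
entirely.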
Richard.

> gcc/ChangeLog:
>
>         * tree-ssa-threadedge.c (simplify_control_stmt_condition): Split
>         out into ...
>         (simplify_control_stmt_condition_1): ... here.  Recurse into
>         BIT_AND_EXPRs and BIT_IOR_EXPRs.
>
> gcc/testsuite/ChangeLog:
>
>         * gcc.dg/tree-ssa/ssa-thread-14.c: New test.
> ---
>  gcc/testsuite/gcc.dg/tree-ssa/ssa-thread-14.c |  81 +++++++++
>  gcc/tree-ssa-threadedge.c                     | 249 +++++++++++++++++++++-----
>  2 files changed, 285 insertions(+), 45 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/ssa-thread-14.c
>
> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/ssa-thread-14.c b/gcc/testsuite/gcc.dg/tree-ssa/ssa-thread-14.c
> new file mode 100644
> index 0000000..db9ed3b
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/tree-ssa/ssa-thread-14.c
> @@ -0,0 +1,81 @@
> +/* { dg-do compile } */
> +/* { dg-additional-options "-O2 -fdump-tree-vrp" } */
> +/* { dg-final { scan-tree-dump-times "Threaded jump" 8 "vrp1" } } */
> +
> +void foo (void);
> +void bar (void);
> +void blah (void);
> +
> +/* One jump threaded here. */
> +
> +void
> +baz_1 (int a, int b, int c)
> +{
> +  if (a && b)
> +    foo ();
> +  if (!b && c)
> +    bar ();
> +}
> +
> +/* One jump threaded here. */
> +
> +void
> +baz_2 (int a, int b, int c)
> +{
> +  if (a && b)
> +    foo ();
> +  if (b || c)
> +    bar ();
> +}
> +
> +/* One jump threaded here. */
> +
> +void
> +baz_3 (int a, int b, int c)
> +{
> +  if (a && b > 10)
> +    foo ();
> +  if (b < 5 && c)
> +    bar ();
> +}
> +
> +/* Two jumps threaded here. */
> +
> +void
> +baz_4 (int a, int b, int c)
> +{
> +  if (a && b)
> +    {
> +      foo ();
> +      if (c)
> +        bar ();
> +    }
> +  if (b && c)
> +    blah ();
> +}
> +
> +/* Two jumps threaded here. */
> +
> +void
> +baz_5 (int a, int b, int c)
> +{
> +  if (a && b)
> +    {
> +      foo ();
> +      if (c)
> +        bar ();
> +    }
> +  if (!b || !c)
> +    blah ();
> +}
> +
> +/* One jump threaded here. */
> +
> +void
> +baz_6 (int a, int b, int c)
> +{
> +  if (a == 39 && b == 41)
> +    foo ();
> +  if (c == 12 || b == 41)
> +    bar ();
> +}
> diff --git a/gcc/tree-ssa-threadedge.c b/gcc/tree-ssa-threadedge.c
> index f60be38..a4e8a26 100644
> --- a/gcc/tree-ssa-threadedge.c
> +++ b/gcc/tree-ssa-threadedge.c
> @@ -376,6 +376,12 @@ record_temporary_equivalences_from_stmts_at_dest (edge e,
>    return stmt;
>  }
>
> +static tree simplify_control_stmt_condition_1 (edge, gimple *,
> +                                               class avail_exprs_stack *,
> +                                               tree, enum tree_code, tree,
> +                                               gcond *, pfn_simplify, bool,
> +                                               unsigned);
> +
>  /* Simplify the control statement at the end of the block E->dest.
>
>     To avoid allocating memory unnecessarily, a scratch GIMPLE_COND
> @@ -436,52 +442,14 @@ simplify_control_stmt_condition (edge e,
>         }
>      }
>
> -  if (handle_dominating_asserts)
> -    {
> -      /* Now see if the operand was consumed by an ASSERT_EXPR
> -         which dominates E->src.  If so, we want to replace the
> -         operand with the LHS of the ASSERT_EXPR.  */
> -      if (TREE_CODE (op0) == SSA_NAME)
> -        op0 = lhs_of_dominating_assert (op0, e->src, stmt);
> -
> -      if (TREE_CODE (op1) == SSA_NAME)
> -        op1 = lhs_of_dominating_assert (op1, e->src, stmt);
> -    }
> -
> -  /* We may need to canonicalize the comparison.  For
> -     example, op0 might be a constant while op1 is an
> -     SSA_NAME.  Failure to canonicalize will cause us to
> -     miss threading opportunities.  */
> -  if (tree_swap_operands_p (op0, op1, false))
> -    {
> -      cond_code = swap_tree_comparison (cond_code);
> -      std::swap (op0, op1);
> -    }
> +  const unsigned recursion_limit = 4;
>
> -  /* Stuff the operator and operands into our dummy conditional
> -     expression.  */
> -  gimple_cond_set_code (dummy_cond, cond_code);
> -  gimple_cond_set_lhs (dummy_cond, op0);
> -  gimple_cond_set_rhs (dummy_cond, op1);
> -
> -  /* We absolutely do not care about any type conversions
> -     we only care about a zero/nonzero value.  */
> -  fold_defer_overflow_warnings ();
> -
> -  cached_lhs = fold_binary (cond_code, boolean_type_node, op0, op1);
> -  if (cached_lhs)
> -    while (CONVERT_EXPR_P (cached_lhs))
> -      cached_lhs = TREE_OPERAND (cached_lhs, 0);
> -
> -  fold_undefer_overflow_warnings ((cached_lhs
> -                                   && is_gimple_min_invariant (cached_lhs)),
> -                                  stmt, WARN_STRICT_OVERFLOW_CONDITIONAL);
> -
> -  /* If we have not simplified the condition down to an invariant,
> -     then use the pass specific callback to simplify the condition.  */
> -  if (!cached_lhs
> -      || !is_gimple_min_invariant (cached_lhs))
> -    cached_lhs = (*simplify) (dummy_cond, stmt, avail_exprs_stack);
> +  cached_lhs
> +    = simplify_control_stmt_condition_1 (e, stmt, avail_exprs_stack,
> +                                         op0, cond_code, op1,
> +                                         dummy_cond, simplify,
> +                                         handle_dominating_asserts,
> +                                         recursion_limit);
>
>    /* If we were testing an integer/pointer against a constant, then
>       we can use the FSM code to trace the value of the SSA_NAME.  If
> @@ -560,6 +528,197 @@ simplify_control_stmt_condition (edge e,
>    return cached_lhs;
>  }
>
> +/* Recursive helper for simplify_control_stmt_condition.  */
> +
> +static tree
> +simplify_control_stmt_condition_1 (edge e,
> +                                   gimple *stmt,
> +                                   class avail_exprs_stack *avail_exprs_stack,
> +                                   tree op0,
> +                                   enum tree_code cond_code,
> +                                   tree op1,
> +                                   gcond *dummy_cond,
> +                                   pfn_simplify simplify,
> +                                   bool handle_dominating_asserts,
> +                                   unsigned limit)
> +{
> +  if (limit == 0)
> +    return NULL_TREE;
> +
> +  /* We may need to canonicalize the comparison.  For
> +     example, op0 might be a constant while op1 is an
> +     SSA_NAME.  Failure to canonicalize will cause us to
> +     miss threading opportunities.  */
> +  if (tree_swap_operands_p (op0, op1, false))
> +    {
> +      cond_code = swap_tree_comparison (cond_code);
> +      std::swap (op0, op1);
> +    }
> +
> +  /* If the condition has the form (A & B) CMP 0 or (A | B) CMP 0 then
> +     recurse into the LHS to see if there is a dominating ASSERT_EXPR
> +     of A or of B that makes this condition always true or always false
> +     along the edge E.  */
> +  if (handle_dominating_asserts
> +      && (cond_code == EQ_EXPR || cond_code == NE_EXPR)
> +      && TREE_CODE (op0) == SSA_NAME
> +      && integer_zerop (op1))
> +    {
> +      gimple *def_stmt = SSA_NAME_DEF_STMT (op0);
> +      if (gimple_code (def_stmt) != GIMPLE_ASSIGN)
> +        ;
> +      else if (gimple_assign_rhs_code (def_stmt) == BIT_AND_EXPR
> +               || gimple_assign_rhs_code (def_stmt) == BIT_IOR_EXPR)
> +        {
> +          enum tree_code rhs_code = gimple_assign_rhs_code (def_stmt);
> +          const tree rhs1 = gimple_assign_rhs1 (def_stmt);
> +          const tree rhs2 = gimple_assign_rhs2 (def_stmt);
> +          const tree zero_cst = build_zero_cst (TREE_TYPE (op0));
> +          const tree one_cst = build_one_cst (TREE_TYPE (op0));
> +
> +          /* Is A != 0 ?  */
> +          const tree res1
> +            = simplify_control_stmt_condition_1 (e, def_stmt,
> +                                                 avail_exprs_stack,
> +                                                 rhs1, NE_EXPR, op1,
> +                                                 dummy_cond, simplify,
> +                                                 handle_dominating_asserts,
> +                                                 limit - 1);
> +          if (res1 == NULL_TREE)
> +            ;
> +          else if (rhs_code == BIT_AND_EXPR && integer_zerop (res1))
> +            {
> +              /* If A == 0 then (A & B) != 0 is always false.  */
> +              if (cond_code == NE_EXPR)
> +                return zero_cst;
> +              /* If A == 0 then (A & B) == 0 is always true.  */
> +              if (cond_code == EQ_EXPR)
> +                return one_cst;
> +            }
> +          else if (rhs_code == BIT_IOR_EXPR && integer_nonzerop (res1))
> +            {
> +              /* If A != 0 then (A | B) != 0 is always true.  */
> +              if (cond_code == NE_EXPR)
> +                return one_cst;
> +              /* If A != 0 then (A | B) == 0 is always false.  */
> +              if (cond_code == EQ_EXPR)
> +                return zero_cst;
> +            }
> +
> +          /* Is B != 0 ?  */
> +          const tree res2
> +            = simplify_control_stmt_condition_1 (e, def_stmt,
> +                                                 avail_exprs_stack,
> +                                                 rhs2, NE_EXPR, op1,
> +                                                 dummy_cond, simplify,
> +                                                 handle_dominating_asserts,
> +                                                 limit - 1);
> +          if (res2 == NULL_TREE)
> +            ;
> +          else if (rhs_code == BIT_AND_EXPR && integer_zerop (res2))
> +            {
> +              /* If B == 0 then (A & B) != 0 is always false.  */
> +              if (cond_code == NE_EXPR)
> +                return zero_cst;
> +              /* If B == 0 then (A & B) == 0 is always true.  */
> +              if (cond_code == EQ_EXPR)
> +                return one_cst;
> +            }
> +          else if (rhs_code == BIT_IOR_EXPR && integer_nonzerop (res2))
> +            {
> +              /* If B != 0 then (A | B) != 0 is always true.  */
> +              if (cond_code == NE_EXPR)
> +                return one_cst;
> +              /* If B != 0 then (A | B) == 0 is always false.  */
> +              if (cond_code == EQ_EXPR)
> +                return zero_cst;
> +            }
> +
> +          if (res1 != NULL_TREE && res2 != NULL_TREE)
> +            {
> +              if (rhs_code == BIT_AND_EXPR
> +                  && TYPE_PRECISION (TREE_TYPE (op0)) == 1
> +                  && integer_nonzerop (res1)
> +                  && integer_nonzerop (res2))
> +                {
> +                  /* If A != 0 and B != 0 then (bool)(A & B) != 0 is true.  */
> +                  if (cond_code == NE_EXPR)
> +                    return one_cst;
> +                  /* If A != 0 and B != 0 then (bool)(A & B) == 0 is false.  */
> +                  if (cond_code == EQ_EXPR)
> +                    return zero_cst;
> +                }
> +
> +              if (rhs_code == BIT_IOR_EXPR
> +                  && integer_zerop (res1)
> +                  && integer_zerop (res2))
> +                {
> +                  /* If A == 0 and B == 0 then (A | B) != 0 is false.  */
> +                  if (cond_code == NE_EXPR)
> +                    return zero_cst;
> +                  /* If A == 0 and B == 0 then (A | B) == 0 is true.  */
> +                  if (cond_code == EQ_EXPR)
> +                    return one_cst;
> +                }
> +            }
> +        }
> +      /* Handle (A CMP B) CMP 0.  */
> +      else if (TREE_CODE_CLASS (gimple_assign_rhs_code (def_stmt))
> +               == tcc_comparison)
> +        {
> +          tree rhs1 = gimple_assign_rhs1 (def_stmt);
> +          tree rhs2 = gimple_assign_rhs2 (def_stmt);
> +
> +          tree_code new_cond = gimple_assign_rhs_code (def_stmt);
> +          if (cond_code == EQ_EXPR)
> +            new_cond = invert_tree_comparison (new_cond, false);
> +
> +          tree res
> +            = simplify_control_stmt_condition_1 (e, def_stmt,
> +                                                 avail_exprs_stack,
> +                                                 rhs1, new_cond, rhs2,
> +                                                 dummy_cond, simplify,
> +                                                 handle_dominating_asserts,
> +                                                 limit - 1);
> +          if (res != NULL_TREE && is_gimple_min_invariant (res))
> +            return res;
> +        }
> +    }
> +
> +  if (handle_dominating_asserts)
> +    {
> +      /* Now see if the operand was consumed by an ASSERT_EXPR
> +         which dominates E->src.  If so, we want to replace the
> +         operand with the LHS of the ASSERT_EXPR.  */
> +      if (TREE_CODE (op0) == SSA_NAME)
> +        op0 = lhs_of_dominating_assert (op0, e->src, stmt);
> +
> +      if (TREE_CODE (op1) == SSA_NAME)
> +        op1 = lhs_of_dominating_assert (op1, e->src, stmt);
> +    }
> +
> +  gimple_cond_set_code (dummy_cond, cond_code);
> +  gimple_cond_set_lhs (dummy_cond, op0);
> +  gimple_cond_set_rhs (dummy_cond, op1);
> +
> +  /* We absolutely do not care about any type conversions
> +     we only care about a zero/nonzero value.  */
> +  fold_defer_overflow_warnings ();
> +
> +  tree res = fold_binary (cond_code, boolean_type_node, op0, op1);
> +  if (res)
> +    while (CONVERT_EXPR_P (res))
> +      res = TREE_OPERAND (res, 0);
> +
> +  fold_undefer_overflow_warnings ((res && is_gimple_min_invariant (res)),
> +                                  stmt, WARN_STRICT_OVERFLOW_CONDITIONAL);
> +
> +  /* If we have not simplified the condition down to an invariant,
> +     then use the pass specific callback to simplify the condition.  */
> +  if (!res
> +      || !is_gimple_min_invariant (res))
> +    res = (*simplify) (dummy_cond, stmt, avail_exprs_stack);
> +
> +  return res;
> +}
> +
>  /* Copy debug stmts from DEST's chain of single predecessors up to
>     SRC, so that we don't lose the bindings as PHI nodes are introduced
>     when DEST gains new predecessors.  */
> --
> 2.8.1.231.g95ac767
>
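For reference, the count the new testcase's dg-final directive scans for
can be checked by hand with something like the following (the numeric part
of the dump file name varies between revisions), which should report the
8 threaded jumps:

  gcc -O2 -fdump-tree-vrp -c ssa-thread-14.c
  grep -c "Threaded jump" ssa-thread-14.c.*t.vrp1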