On Sat, 9 Apr 2022, Jakub Jelinek wrote:

> Hi!
>
> Here is an attempt to resolve a P1 regression, where due to threading
> changes we no longer optimize
>   bool foo(int i) {
>     while (i == 4)
>       i += 2;
>     return i;
>   }
> to just return i != 0; by enhancing the phiopt value_replacement
> optimization.  Normally it will optimize x != cst1 ? x : cst1 to x.
> Here we extend it to also optimize x != cst1 ? x : cst2 to x if
> it (phi result) has a single immediate use which is a comparison
> with some INTEGER_CST cst3 and we can prove that we don't care
> whether x is cst1 or cst2 because both compare the same against cst3.
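To make that concrete, here is a minimal source-level sketch of the shape
being targeted (a made-up function bar, in the spirit of the new testcases
further down, not part of the patch):

  int
  bar (int x)
  {
    if (x == 4)
      x = 6;   /* after the diamond: x' = PHI <x, 6> */
    /* The PHI result's only use is this comparison.  Both 4 > 3 and
       6 > 3 are true, so the comparison cannot tell which value the
       PHI carries; the diamond folds to x > 3 on the original x.  */
    return x > 3;
  }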
Hmm, it feels a bit awkward to key this on the middle-man (but I guess
that is what you get from catching this in phiopt).  In reality we are
optimizing the final comparison with sth like
  (cmp (cond (ne @0 INTEGER_CST@1) @0 INTEGER_CST@2) INTEGER_CST@3)
to (cmp @0 @3) by matching the PHI node as a (cond ...).  But unless we
want to "fold" all stmts in phiopt (we could fold all conds ...), that's
not something we can do in phiopt - possibly forwprop could be enhanced
to match PHIs that way (with special constraints).  There's a
long-standing TODO to experiment with gimple-match.cc to match PHIs
directly, but I've never got around to that and I think it does need
special constraining to not pessimize things and not expose defs not on
the dominating path.

> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

So I'm not very happy with the middle-man here.  Wrt forwprop matching
the PHI I was thinking of hacking the valueize hook passed to fold_stmt
to do sth like

  if (gphi *phi = dyn_cast <gphi *> (SSA_NAME_DEF_STMT (name)))
    {
      if (!phi has a single use, and the CFG is a simple diamond with
           basically no-op middle blocks)
        return name;
      gassign *tem = XOBALLOC (...); // on some temporary obstack
      gimple_init (tem, GIMPLE_ASSIGN, 4);
      gimple_assign_set_rhs_code (tem, COND_EXPR);
      ... see how maybe_fold_comparisons_from_match_pd builds a fake stmt
      tree fake_def = XOBNEW (...);
      .. init SSA name ..
      gimple_assign_set_lhs (tem, fake_def);
      return fake_def;
    }

and forwprop setting up the obstack and scrapping it later.
maybe_fold_comparisons_from_match_pd gets away with doing this on the
stack, but the above is from callbacks in match-and-simplify.  The only
ugly thing is that we possibly get called multiple times on the same SSA
name, so either we cache the built temps or just create them multiple
times.  We also have to look out for the fake SSA defs in the result and
either replace them with the original PHIs (just re-use the
SSA_NAME_VERSION maybe and check that SSA_NAME (i) is the same?) or give
up.  Likewise for defs in the middle-blocks ending up referenced in the
result.

Another possibility is to pre-compute the COND_EXPRs for suitable blocks
during the forwprop CFG walk once and have them recorded somewhere
(maybe in the copy lattice itself, and specially marked, possibly with
SSA_NAME_OCCURS_IN_ABNORMAL_PHI so they won't ever be propagated and any
result referring to them scrapped).

But that all feels like stage1 material to me.  So for GCC 12 the
middle-man approach in your patch is probably the way to go.

So ... OK.

Thanks,
Richard.

> 2022-04-09  Jakub Jelinek  <ja...@redhat.com>
>
>         PR tree-optimization/104639
>         * tree-ssa-phiopt.cc: Include tree-ssa-propagate.h.
>         (value_replacement): Optimize (x != cst1 ? x : cst2) != cst3
>         into x != cst3.
>
>         * gcc.dg/tree-ssa/pr104639-1.c: New test.
>         * gcc.dg/tree-ssa/pr104639-2.c: New test.
>
> --- gcc/tree-ssa-phiopt.cc.jj	2022-04-07 17:18:14.476882106 +0200
> +++ gcc/tree-ssa-phiopt.cc	2022-04-08 17:49:30.892920502 +0200
> @@ -52,6 +52,7 @@ along with GCC; see the file COPYING3.
>  #include "gimple-range.h"
>  #include "gimple-match.h"
>  #include "dbgcnt.h"
> +#include "tree-ssa-propagate.h"
>  
>  static unsigned int tree_ssa_phiopt_worker (bool, bool, bool);
>  static bool two_value_replacement (basic_block, basic_block, edge, gphi *,
> @@ -1327,7 +1328,17 @@ value_replacement (basic_block cond_bb,
>       We now need to verify that the two arguments in the PHI node match
>       the two arguments to the equality comparison.  */
>  
> -  if (operand_equal_for_value_replacement (arg0, arg1, &code, cond))
> +  bool equal_p = operand_equal_for_value_replacement (arg0, arg1, &code, cond);
> +  bool maybe_equal_p = false;
> +  if (!equal_p
> +      && empty_or_with_defined_p
> +      && TREE_CODE (gimple_cond_rhs (cond)) == INTEGER_CST
> +      && (operand_equal_for_phi_arg_p (gimple_cond_lhs (cond), arg0)
> +          ? TREE_CODE (arg1) == INTEGER_CST
> +          : (operand_equal_for_phi_arg_p (gimple_cond_lhs (cond), arg1)
> +             && TREE_CODE (arg0) == INTEGER_CST)))
> +    maybe_equal_p = true;
> +  if (equal_p || maybe_equal_p)
>      {
>        edge e;
>        tree arg;
> @@ -1358,11 +1369,123 @@ value_replacement (basic_block cond_bb,
>            && single_non_singleton_phi_for_edges (phi_nodes (gimple_bb (phi)),
>                                                   e0, e1) == phi)
>          {
> -          replace_phi_edge_with_variable (cond_bb, e1, phi, arg);
> -          /* Note that we optimized this PHI.  */
> -          return 2;
> +          use_operand_p use_p;
> +          gimple *use_stmt;
> +
> +          /* Even if arg0/arg1 isn't equal to second operand of cond, we
> +             can optimize away the bb if we can prove it doesn't care whether
> +             phi result is arg0/arg1 or second operand of cond.  Consider:
> +             <bb 2> [local count: 118111600]:
> +             if (i_2(D) == 4)
> +               goto <bb 4>; [97.00%]
> +             else
> +               goto <bb 3>; [3.00%]
> +
> +             <bb 3> [local count: 3540129]:
> +
> +             <bb 4> [local count: 118111600]:
> +             # i_6 = PHI <i_2(D)(3), 6(2)>
> +             _3 = i_6 != 0;
> +             Here, carg is 4, oarg is 6, crhs is 0, and because
> +             (4 != 0) == (6 != 0), we don't care if i_6 is 4 or 6, both
> +             have the same outcome.  So, we can optimize this to:
> +             _3 = i_2(D) != 0;
> +             If the single imm use of phi result is >, >=, < or <=, similarly
> +             we can check if both carg and oarg compare the same against
> +             crhs using ccode.  */
> +          if (maybe_equal_p
> +              && TREE_CODE (arg) != INTEGER_CST
> +              && single_imm_use (gimple_phi_result (phi), &use_p, &use_stmt))
> +            {
> +              enum tree_code ccode = ERROR_MARK;
> +              tree clhs = NULL_TREE, crhs = NULL_TREE;
> +              tree carg = gimple_cond_rhs (cond);
> +              tree oarg = e0 == e ? arg1 : arg0;
> +              if (is_gimple_assign (use_stmt)
> +                  && (TREE_CODE_CLASS (gimple_assign_rhs_code (use_stmt))
> +                      == tcc_comparison))
> +                {
> +                  ccode = gimple_assign_rhs_code (use_stmt);
> +                  clhs = gimple_assign_rhs1 (use_stmt);
> +                  crhs = gimple_assign_rhs2 (use_stmt);
> +                }
> +              else if (gimple_code (use_stmt) == GIMPLE_COND)
> +                {
> +                  ccode = gimple_cond_code (use_stmt);
> +                  clhs = gimple_cond_lhs (use_stmt);
> +                  crhs = gimple_cond_rhs (use_stmt);
> +                }
> +              if (ccode != ERROR_MARK
> +                  && clhs == gimple_phi_result (phi)
> +                  && TREE_CODE (crhs) == INTEGER_CST)
> +                switch (ccode)
> +                  {
> +                  case EQ_EXPR:
> +                  case NE_EXPR:
> +                    if (!tree_int_cst_equal (crhs, carg)
> +                        && !tree_int_cst_equal (crhs, oarg))
> +                      equal_p = true;
> +                    break;
> +                  case GT_EXPR:
> +                    if (tree_int_cst_lt (crhs, carg)
> +                        == tree_int_cst_lt (crhs, oarg))
> +                      equal_p = true;
> +                    break;
> +                  case GE_EXPR:
> +                    if (tree_int_cst_le (crhs, carg)
> +                        == tree_int_cst_le (crhs, oarg))
> +                      equal_p = true;
> +                    break;
> +                  case LT_EXPR:
> +                    if (tree_int_cst_lt (carg, crhs)
> +                        == tree_int_cst_lt (oarg, crhs))
> +                      equal_p = true;
> +                    break;
> +                  case LE_EXPR:
> +                    if (tree_int_cst_le (carg, crhs)
> +                        == tree_int_cst_le (oarg, crhs))
> +                      equal_p = true;
> +                    break;
> +                  default:
> +                    break;
> +                  }
> +              if (equal_p && MAY_HAVE_DEBUG_BIND_STMTS)
> +                {
> +                  imm_use_iterator imm_iter;
> +                  tree phires = gimple_phi_result (phi);
> +                  tree temp = NULL_TREE;
> +
> +                  /* Add # DEBUG D#1 => arg != carg ? arg : oarg.  */
> +                  FOR_EACH_IMM_USE_STMT (use_stmt, imm_iter, phires)
> +                    {
> +                      if (!is_gimple_debug (use_stmt))
> +                        continue;
> +                      if (temp == NULL_TREE)
> +                        {
> +                          gimple_stmt_iterator gsi
> +                            = gsi_after_labels (gimple_bb (phi));
> +                          tree type = TREE_TYPE (phires);
> +                          temp = build_debug_expr_decl (type);
> +                          tree t = build2 (NE_EXPR, boolean_type_node,
> +                                           arg, carg);
> +                          t = build3 (COND_EXPR, type, t, arg, oarg);
> +                          gimple *g = gimple_build_debug_bind (temp, t, phi);
> +                          gsi_insert_before (&gsi, g, GSI_SAME_STMT);
> +                        }
> +                      FOR_EACH_IMM_USE_ON_STMT (use_p, imm_iter)
> +                        replace_exp (use_p, temp);
> +                      update_stmt (use_stmt);
> +                    }
> +                }
> +            }
> +          if (equal_p)
> +            {
> +              replace_phi_edge_with_variable (cond_bb, e1, phi, arg);
> +              /* Note that we optimized this PHI.  */
> +              return 2;
> +            }
>          }
> -      else
> +      else if (equal_p)
>          {
>            if (!single_pred_p (middle_bb))
>              return 0;
> --- gcc/testsuite/gcc.dg/tree-ssa/pr104639-1.c.jj	2022-04-08 17:58:09.723807501 +0200
> +++ gcc/testsuite/gcc.dg/tree-ssa/pr104639-1.c	2022-04-08 17:57:38.592234290 +0200
> @@ -0,0 +1,13 @@
> +/* PR tree-optimization/104639 */
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -g -fdump-tree-optimized" } */
> +/* { dg-final { scan-tree-dump-not "PHI <" "optimized" } } */
> +/* { dg-final { scan-tree-dump-times "i_\[0-9]*\\\(D\\\) != 0;" 1 "optimized" } } */
> +
> +_Bool
> +foo (int i)
> +{
> +  while (i == 4)
> +    i += 2;
> +  return i;
> +}
> --- gcc/testsuite/gcc.dg/tree-ssa/pr104639-2.c.jj	2022-04-08 17:58:06.081857428 +0200
> +++ gcc/testsuite/gcc.dg/tree-ssa/pr104639-2.c	2022-04-08 17:56:28.759191647 +0200
> @@ -0,0 +1,54 @@
> +/* PR tree-optimization/104639 */
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -fno-tree-pre -g -fdump-tree-optimized" } */
> +/* { dg-final { scan-tree-dump-not "PHI <" "optimized" } } */
> +/* { dg-final { scan-tree-dump-times "x_\[0-9]*\\\(D\\\) != 42;" 1 "optimized" } } */
> +/* { dg-final { scan-tree-dump-times "y_\[0-9]*\\\(D\\\) > 6;" 1 "optimized" } } */
> +/* { dg-final { scan-tree-dump-times "z_\[0-9]*\\\(D\\\) > 9;" 1 "optimized" } } */
> +/* { dg-final { scan-tree-dump-times "u_\[0-9]*\\\(D\\\) <= 7;" 1 "optimized" } } */
> +/* { dg-final { scan-tree-dump-times "v_\[0-9]*\\\(D\\\) <= 42;" 1 "optimized" } } */
> +
> +int
> +f1 (int x)
> +{
> +  if (x == 4)
> +    x = 6;
> +  int xd = x;
> +  return x != 42;
> +}
> +
> +int
> +f2 (int y)
> +{
> +  if (y == 4)
> +    y = 6;
> +  int yd = y;
> +  return y > 6;
> +}
> +
> +int
> +f3 (int z)
> +{
> +  if (z == 4)
> +    z = 6;
> +  int zd = z;
> +  return z >= 10;
> +}
> +
> +int
> +f4 (int u)
> +{
> +  if (u == 4)
> +    u = 6;
> +  int ud = u;
> +  return u < 8;
> +}
> +
> +int
> +f5 (int v)
> +{
> +  if (v == 4)
> +    v = 6;
> +  int vd = v;
> +  return v <= 42;
> +}
>
> 	Jakub
>

-- 
Richard Biener <rguent...@suse.de>
SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg,
Germany; GF: Ivo Totev; HRB 36809 (AG Nuernberg)