On Sat, 9 Apr 2022, Jakub Jelinek wrote:

> Hi!
> 
> Here is an attempt to resolve a P1 regression, where due to threading
> changes we no longer optimize
> bool foo(int i) {
>     while (i == 4)
>         i += 2;
>     return i;
> }
> to just return i != 0; by enhancing the phiopt value_replacement
> optimization.  Normally it will optimize x != cst1 ? x : cst1 to x.
> Here we extend it to also optimize x != cst1 ? x : cst2 to x if
> it (phi result) has a single immediate use which is a comparison
> with some INTEGER_CST cst3 and we can prove that we don't care
> whether x is cst1 or cst2 because both compare the same against cst3.

Hmm, it feels a bit awkward to key this on the middle-man (but I guess
that is what you get from catching this in phiopt).  In reality we
are optimizing the final comparison with sth like

 (cmp (cond (ne @0 INTEGER_CST@1) @0 INTEGER_CST@2) INTEGER_CST@3)

to

 (cmp @0 @3)

with matching the PHI node as a (cond ...).  But unless we want
to "fold" all stmts in phiopt (we could fold all conds ...), that's
not something we can do in phiopt - possibly forwprop could be enhanced to
match PHIs that way (with special constraints).  There's some
long-time TODO to experiment with gimple-match.cc to match PHIs
directly, but I've never got up to that and I think it does need
special constraining to not pessimize things and not expose defs
not on the dominating path.

> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

So I'm not very happy with the middle-man here.

Wrt forwprop matching the PHI I was thinking of hacking the valueize
hook passed to fold_stmt to do sth like

  if (gphi *phi = dyn_cast <gphi *> (SSA_NAME_DEF_STMT (name)))
    {
      if (!phi has a single use, and the CFG is a simple diamond with
           basically no-op middle blocks)
        return name;
      gassign *tem = XOBALLOC (...); // on some temporary obstack
      gimple_init (tem, GIMPLE_ASSIGN, 4);
      gimple_assing_set_rhs_code (tem, COND_EXPR);
... see how maybe_fold_comparisons_from_match_pd builds a fake stmt
      tree fake_def = XOBNEW (...);
      .. init SSA name ..
      gimple_assign_set_lhs (tem, fake_def);
      return fake_def;
    }

and forwprop setting up the obstack and scrap it later.
maybe_fold_comparisons_from_match_pd gets away doing this on the stack,
but the above is from callbacks in match-and-simplify.  The only
ugly thing is that we possibly get called multiple times on the same
SSA name, so either we cache the built temps or just create them
multiple times.  We also have to look out for the fake SSA defs
in the result and either replace them with the original PHIs
(just re-use the SSA_NAME_VERSION maybe and check that SSA_NAME (i)
is the same?) or give up.  Likewise for defs in the middle-blocks
ending up referenced in the result.

Another possibility is to pre-compute the COND_EXPRs for suitable
blocks during the forwprop CFG walk once and have them recorded
somewhere (maybe in the copy lattice itself, and specially marked,
possibly with SSA_NAME_OCCURS_IN_ABNORMAL_PHI so they won't ever
be propagated and any result refering them scrapped).

But that all feels like stage1 material to me.

So for GCC 12 the middle-man approach in your patch is probably
the way to go.

So ...

OK.

Thanks,
Richard.

> 2022-04-09  Jakub Jelinek  <ja...@redhat.com>
> 
>       PR tree-optimization/104639
>       * tree-ssa-phiopt.cc: Include tree-ssa-propagate.h.
>       (value_replacement): Optimize (x != cst1 ? x : cst2) != cst3
>       into x != cst3.
> 
>       * gcc.dg/tree-ssa/pr104639-1.c: New test.
>       * gcc.dg/tree-ssa/pr104639-2.c: New test.
> 
> --- gcc/tree-ssa-phiopt.cc.jj 2022-04-07 17:18:14.476882106 +0200
> +++ gcc/tree-ssa-phiopt.cc    2022-04-08 17:49:30.892920502 +0200
> @@ -52,6 +52,7 @@ along with GCC; see the file COPYING3.
>  #include "gimple-range.h"
>  #include "gimple-match.h"
>  #include "dbgcnt.h"
> +#include "tree-ssa-propagate.h"
>  
>  static unsigned int tree_ssa_phiopt_worker (bool, bool, bool);
>  static bool two_value_replacement (basic_block, basic_block, edge, gphi *,
> @@ -1327,7 +1328,17 @@ value_replacement (basic_block cond_bb,
>       We now need to verify that the two arguments in the PHI node match
>       the two arguments to the equality comparison.  */
>  
> -  if (operand_equal_for_value_replacement (arg0, arg1, &code, cond))
> +  bool equal_p = operand_equal_for_value_replacement (arg0, arg1, &code, 
> cond);
> +  bool maybe_equal_p = false;
> +  if (!equal_p
> +      && empty_or_with_defined_p
> +      && TREE_CODE (gimple_cond_rhs (cond)) == INTEGER_CST
> +      && (operand_equal_for_phi_arg_p (gimple_cond_lhs (cond), arg0)
> +       ? TREE_CODE (arg1) == INTEGER_CST
> +       : (operand_equal_for_phi_arg_p (gimple_cond_lhs (cond), arg1)
> +          && TREE_CODE (arg0) == INTEGER_CST)))
> +    maybe_equal_p = true;
> +  if (equal_p || maybe_equal_p)
>      {
>        edge e;
>        tree arg;
> @@ -1358,11 +1369,123 @@ value_replacement (basic_block cond_bb,
>         && single_non_singleton_phi_for_edges (phi_nodes (gimple_bb (phi)),
>                                                e0, e1) == phi)
>       {
> -          replace_phi_edge_with_variable (cond_bb, e1, phi, arg);
> -       /* Note that we optimized this PHI.  */
> -       return 2;
> +       use_operand_p use_p;
> +       gimple *use_stmt;
> +
> +       /* Even if arg0/arg1 isn't equal to second operand of cond, we
> +          can optimize away the bb if we can prove it doesn't care whether
> +          phi result is arg0/arg1 or second operand of cond.  Consider:
> +          <bb 2> [local count: 118111600]:
> +          if (i_2(D) == 4)
> +            goto <bb 4>; [97.00%]
> +          else
> +            goto <bb 3>; [3.00%]
> +
> +          <bb 3> [local count: 3540129]:
> +
> +          <bb 4> [local count: 118111600]:
> +          # i_6 = PHI <i_2(D)(3), 6(2)>
> +          _3 = i_6 != 0;
> +          Here, carg is 4, oarg is 6, crhs is 0, and because
> +          (4 != 0) == (6 != 0), we don't care if i_6 is 4 or 6, both
> +          have the same outcome.  So, can can optimize this to:
> +          _3 = i_2(D) != 0;
> +          If the single imm use of phi result >, >=, < or <=, similarly
> +          we can check if both carg and oarg compare the same against
> +          crhs using ccode.  */
> +       if (maybe_equal_p
> +           && TREE_CODE (arg) != INTEGER_CST
> +           && single_imm_use (gimple_phi_result (phi), &use_p, &use_stmt))
> +         {
> +           enum tree_code ccode = ERROR_MARK;
> +           tree clhs = NULL_TREE, crhs = NULL_TREE;
> +           tree carg = gimple_cond_rhs (cond);
> +           tree oarg = e0 == e ? arg1 : arg0;
> +           if (is_gimple_assign (use_stmt)
> +               && (TREE_CODE_CLASS (gimple_assign_rhs_code (use_stmt))
> +                   == tcc_comparison))
> +             {
> +               ccode = gimple_assign_rhs_code (use_stmt);
> +               clhs = gimple_assign_rhs1 (use_stmt);
> +               crhs = gimple_assign_rhs2 (use_stmt);
> +             }
> +           else if (gimple_code (use_stmt) == GIMPLE_COND)
> +             {
> +               ccode = gimple_cond_code (use_stmt);
> +               clhs = gimple_cond_lhs (use_stmt);
> +               crhs = gimple_cond_rhs (use_stmt);
> +             }
> +           if (ccode != ERROR_MARK
> +               && clhs == gimple_phi_result (phi)
> +               && TREE_CODE (crhs) == INTEGER_CST)
> +             switch (ccode)
> +               {
> +               case EQ_EXPR:
> +               case NE_EXPR:
> +                 if (!tree_int_cst_equal (crhs, carg)
> +                     && !tree_int_cst_equal (crhs, oarg))
> +                   equal_p = true;
> +                 break;
> +               case GT_EXPR:
> +                 if (tree_int_cst_lt (crhs, carg)
> +                     == tree_int_cst_lt (crhs, oarg))
> +                   equal_p = true;
> +                 break;
> +               case GE_EXPR:
> +                 if (tree_int_cst_le (crhs, carg)
> +                     == tree_int_cst_le (crhs, oarg))
> +                   equal_p = true;
> +                 break;
> +               case LT_EXPR:
> +                 if (tree_int_cst_lt (carg, crhs)
> +                     == tree_int_cst_lt (oarg, crhs))
> +                   equal_p = true;
> +                 break;
> +               case LE_EXPR:
> +                 if (tree_int_cst_le (carg, crhs)
> +                     == tree_int_cst_le (oarg, crhs))
> +                   equal_p = true;
> +                 break;
> +               default:
> +                 break;
> +               }
> +           if (equal_p && MAY_HAVE_DEBUG_BIND_STMTS)
> +             {
> +               imm_use_iterator imm_iter;
> +               tree phires = gimple_phi_result (phi);
> +               tree temp = NULL_TREE;
> +
> +               /* Add # DEBUG D#1 => arg != carg ? arg : oarg.  */
> +               FOR_EACH_IMM_USE_STMT (use_stmt, imm_iter, phires)
> +                 {
> +                   if (!is_gimple_debug (use_stmt))
> +                     continue;
> +                   if (temp == NULL_TREE)
> +                     {
> +                       gimple_stmt_iterator gsi
> +                         = gsi_after_labels (gimple_bb (phi));
> +                       tree type = TREE_TYPE (phires);
> +                       temp = build_debug_expr_decl (type);
> +                       tree t = build2 (NE_EXPR, boolean_type_node,
> +                                        arg, carg);
> +                       t = build3 (COND_EXPR, type, t, arg, oarg);
> +                       gimple *g = gimple_build_debug_bind (temp, t, phi);
> +                       gsi_insert_before (&gsi, g, GSI_SAME_STMT);
> +                     }
> +                   FOR_EACH_IMM_USE_ON_STMT (use_p, imm_iter)
> +                     replace_exp (use_p, temp);
> +                   update_stmt (use_stmt);
> +                 }
> +             }
> +         }
> +       if (equal_p)
> +         {
> +           replace_phi_edge_with_variable (cond_bb, e1, phi, arg);
> +           /* Note that we optimized this PHI.  */
> +           return 2;
> +         }
>       }
> -      else
> +      else if (equal_p)
>       {
>         if (!single_pred_p (middle_bb))
>           return 0;
> --- gcc/testsuite/gcc.dg/tree-ssa/pr104639-1.c.jj     2022-04-08 
> 17:58:09.723807501 +0200
> +++ gcc/testsuite/gcc.dg/tree-ssa/pr104639-1.c        2022-04-08 
> 17:57:38.592234290 +0200
> @@ -0,0 +1,13 @@
> +/* PR tree-optimization/104639 */
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -g -fdump-tree-optimized" } */
> +/* { dg-final { scan-tree-dump-not "PHI <" "optimized" } } */
> +/* { dg-final { scan-tree-dump-times "i_\[0-9]*\\\(D\\\) != 0;" 1 
> "optimized" } } */
> +
> +_Bool
> +foo (int i)
> +{
> +  while (i == 4)
> +    i += 2;
> +  return i;
> +}
> --- gcc/testsuite/gcc.dg/tree-ssa/pr104639-2.c.jj     2022-04-08 
> 17:58:06.081857428 +0200
> +++ gcc/testsuite/gcc.dg/tree-ssa/pr104639-2.c        2022-04-08 
> 17:56:28.759191647 +0200
> @@ -0,0 +1,54 @@
> +/* PR tree-optimization/104639 */
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -fno-tree-pre -g -fdump-tree-optimized" } */
> +/* { dg-final { scan-tree-dump-not "PHI <" "optimized" } } */
> +/* { dg-final { scan-tree-dump-times "x_\[0-9]*\\\(D\\\) != 42;" 1 
> "optimized" } } */
> +/* { dg-final { scan-tree-dump-times "y_\[0-9]*\\\(D\\\) > 6;" 1 "optimized" 
> } } */
> +/* { dg-final { scan-tree-dump-times "z_\[0-9]*\\\(D\\\) > 9;" 1 "optimized" 
> } } */
> +/* { dg-final { scan-tree-dump-times "u_\[0-9]*\\\(D\\\) <= 7;" 1 
> "optimized" } } */
> +/* { dg-final { scan-tree-dump-times "v_\[0-9]*\\\(D\\\) <= 42;" 1 
> "optimized" } } */
> +
> +int
> +f1 (int x)
> +{
> +  if (x == 4)
> +    x = 6;
> +  int xd = x;
> +  return x != 42;
> +}
> +
> +int
> +f2 (int y)
> +{
> +  if (y == 4)
> +    y = 6;
> +  int yd = y;
> +  return y > 6;
> +}
> +
> +int
> +f3 (int z)
> +{
> +  if (z == 4)
> +    z = 6;
> +  int zd = z;
> +  return z >= 10;
> +}
> +
> +int
> +f4 (int u)
> +{
> +  if (u == 4)
> +    u = 6;
> +  int ud = u;
> +  return u < 8;
> +}
> +
> +int
> +f5 (int v)
> +{
> +  if (v == 4)
> +    v = 6;
> +  int vd = v;
> +  return v <= 42;
> +}
> 
>       Jakub
> 
> 

-- 
Richard Biener <rguent...@suse.de>
SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg,
Germany; GF: Ivo Totev; HRB 36809 (AG Nuernberg)

Reply via email to