Re: [PATCH v3] c++: Fix compile-time-hog in cp_fold_immediate_r [PR111660]

Jason Merrill Thu, 19 Oct 2023 09:33:34 -0700

On 10/19/23 10:14, Marek Polacek wrote:

On Thu, Oct 19, 2023 at 10:06:01AM -0400, Jason Merrill wrote:

On 10/19/23 09:39, Patrick Palka wrote:

On Tue, 17 Oct 2023, Marek Polacek wrote:

On Tue, Oct 17, 2023 at 04:49:52PM -0400, Jason Merrill wrote:

On 10/16/23 20:39, Marek Polacek wrote:

On Sat, Oct 14, 2023 at 01:13:22AM -0400, Jason Merrill wrote:

On 10/13/23 14:53, Marek Polacek wrote:

On Thu, Oct 12, 2023 at 09:41:43PM -0400, Jason Merrill wrote:

On 10/12/23 17:04, Marek Polacek wrote:

Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk?

-- >8 --
My recent patch introducing cp_fold_immediate_r caused exponential
compile time with nested COND_EXPRs.  The problem is that the COND_EXPR
case recursively walks the arms of a COND_EXPR, but after processing
both arms it doesn't end the walk; it proceeds to walk the
sub-expressions of the outermost COND_EXPR, triggering again walking
the arms of the nested COND_EXPR, and so on.  This patch brings the
compile time down to about 0m0.033s.

I've added some debug prints to make sure that the rest of cp_fold_r
is still performed as before.

             PR c++/111660

gcc/cp/ChangeLog:

             * cp-gimplify.cc (cp_fold_immediate_r) <case COND_EXPR>: Return
             integer_zero_node instead of break;.
             (cp_fold_immediate): Return true if cp_fold_immediate_r returned
             error_mark_node.

gcc/testsuite/ChangeLog:

             * g++.dg/cpp0x/hog1.C: New test.
---
      gcc/cp/cp-gimplify.cc             |  9 ++--
      gcc/testsuite/g++.dg/cpp0x/hog1.C | 77 +++++++++++++++++++++++++++++++
      2 files changed, 82 insertions(+), 4 deletions(-)
      create mode 100644 gcc/testsuite/g++.dg/cpp0x/hog1.C

diff --git a/gcc/cp/cp-gimplify.cc b/gcc/cp/cp-gimplify.cc
index bdf6e5f98ff..ca622ca169a 100644
--- a/gcc/cp/cp-gimplify.cc
+++ b/gcc/cp/cp-gimplify.cc
@@ -1063,16 +1063,16 @@ cp_fold_immediate_r (tree *stmt_p, int *walk_subtrees, 
void *data_)
        break;
            if (TREE_OPERAND (stmt, 1)
          && cp_walk_tree (&TREE_OPERAND (stmt, 1), cp_fold_immediate_r, data,
-                          nullptr))
+                          nullptr) == error_mark_node)
        return error_mark_node;
            if (TREE_OPERAND (stmt, 2)
          && cp_walk_tree (&TREE_OPERAND (stmt, 2), cp_fold_immediate_r, data,
-                          nullptr))
+                          nullptr) == error_mark_node)
        return error_mark_node;
            /* We're done here.  Don't clear *walk_subtrees here though: we're 
called
         from cp_fold_r and we must let it recurse on the expression with
         cp_fold.  */
-      break;
+      return integer_zero_node;


I'm concerned this will end up missing something like

1 ? 1 : ((1 ? 1 : 1), immediate())

as the integer_zero_node from the inner ?: will prevent walk_tree from
looking any farther.


You are right.  The line above works as expected, but

      1 ? 1 : ((1 ? 1 : id (42)), id (i));

shows the problem (when the expression isn't used as an initializer).

Maybe we want to handle COND_EXPR in cp_fold_r instead of here?


I hope this version is better.

Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk?

-- >8 --
My recent patch introducing cp_fold_immediate_r caused exponential
compile time with nested COND_EXPRs.  The problem is that the COND_EXPR
case recursively walks the arms of a COND_EXPR, but after processing
both arms it doesn't end the walk; it proceeds to walk the
sub-expressions of the outermost COND_EXPR, triggering again walking
the arms of the nested COND_EXPR, and so on.  This patch brings the
compile time down to about 0m0.033s.


Is this number still accurate for this version?


It is.  I ran time(1) a few more times and the results were 0m0.033s - 0m0.035s.
That said, ...

This change seems algorithmically better than the current code, but still
problematic: if we have nested COND_EXPR A/B/C/D/E, it looks like we will
end up cp_fold_immediate_r walking the arms of E five times, once for each
COND_EXPR.


...this is accurate.  I should have addressed the redundant folding in v2
even though the compilation is pretty much immediate.

What I was thinking by handling COND_EXPR in cp_fold_r was to cp_fold_r walk
its subtrees (or cp_fold_immediate_r if it's clear from op0 that the branch
isn't taken) so we can clear *walk_subtrees and we don't fold_imm walk a
node more than once.


I agree I should do better here.  How's this, then?  I've added
debug_generic_expr to cp_fold_immediate_r to see if it gets the same
expr multiple times and it doesn't seem to.

Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk?

-- >8 --
My recent patch introducing cp_fold_immediate_r caused exponential
compile time with nested COND_EXPRs.  The problem is that the COND_EXPR
case recursively walks the arms of a COND_EXPR, but after processing
both arms it doesn't end the walk; it proceeds to walk the
sub-expressions of the outermost COND_EXPR, triggering again walking
the arms of the nested COND_EXPR, and so on.  This patch brings the
compile time down to about 0m0.030s.

The ff_fold_immediate flag is unused after this patch but since I'm
using it in the P2564 patch, I'm not removing it now.  Maybe at_eof
can be used instead and then we can remove ff_fold_immediate.

           PR c++/111660

gcc/cp/ChangeLog:

           * cp-gimplify.cc (cp_fold_immediate_r) <case COND_EXPR>: Don't
        handle it here.
           (cp_fold_r): Handle COND_EXPR here.

gcc/testsuite/ChangeLog:

           * g++.dg/cpp0x/hog1.C: New test.
        * g++.dg/cpp2a/consteval36.C: New test.
---
    gcc/cp/cp-gimplify.cc                    | 52 +++++++++-------
    gcc/testsuite/g++.dg/cpp0x/hog1.C        | 77 ++++++++++++++++++++++++
    gcc/testsuite/g++.dg/cpp2a/consteval36.C | 22 +++++++
    3 files changed, 128 insertions(+), 23 deletions(-)
    create mode 100644 gcc/testsuite/g++.dg/cpp0x/hog1.C
    create mode 100644 gcc/testsuite/g++.dg/cpp2a/consteval36.C

diff --git a/gcc/cp/cp-gimplify.cc b/gcc/cp/cp-gimplify.cc
index bdf6e5f98ff..a282c3930a3 100644
--- a/gcc/cp/cp-gimplify.cc
+++ b/gcc/cp/cp-gimplify.cc
@@ -1052,27 +1052,6 @@ cp_fold_immediate_r (tree *stmt_p, int *walk_subtrees, 
void *data_)
      switch (TREE_CODE (stmt))
        {
-    /* Unfortunately we must handle code like
-        false ? bar () : 42
-       where we have to check bar too.  The cp_fold call in cp_fold_r could
-       fold the ?: into a constant before we see it here.  */
-    case COND_EXPR:
-      /* If we are called from cp_fold_immediate, we don't need to worry about
-        cp_fold folding away the COND_EXPR.  */
-      if (data->flags & ff_fold_immediate)
-       break;
-      if (TREE_OPERAND (stmt, 1)
-         && cp_walk_tree (&TREE_OPERAND (stmt, 1), cp_fold_immediate_r, data,
-                          nullptr))
-       return error_mark_node;
-      if (TREE_OPERAND (stmt, 2)
-         && cp_walk_tree (&TREE_OPERAND (stmt, 2), cp_fold_immediate_r, data,
-                          nullptr))
-       return error_mark_node;
-      /* We're done here.  Don't clear *walk_subtrees here though: we're called
-        from cp_fold_r and we must let it recurse on the expression with
-        cp_fold.  */
-      break;
        case PTRMEM_CST:
          if (TREE_CODE (PTRMEM_CST_MEMBER (stmt)) == FUNCTION_DECL
          && DECL_IMMEDIATE_FUNCTION_P (PTRMEM_CST_MEMBER (stmt)))
@@ -1162,8 +1141,35 @@ cp_fold_r (tree *stmt_p, int *walk_subtrees, void *data_)
      tree stmt = *stmt_p;
      enum tree_code code = TREE_CODE (stmt);
-  if (cxx_dialect > cxx17)
-    cp_fold_immediate_r (stmt_p, walk_subtrees, data);
+  if (cxx_dialect >= cxx20)
+    {
+      /* Unfortunately we must handle code like
+          false ? bar () : 42
+        where we have to check bar too.  The cp_fold call below could
+        fold the ?: into a constant before we've checked it.  */
+      if (code == COND_EXPR)
+       {
+         auto then_fn = cp_fold_r, else_fn = cp_fold_r;
+         /* See if we can figure out if either of the branches is dead.  If it
+            is, we don't need to do everything that cp_fold_r does.  */
+         tree cond = maybe_constant_value (TREE_OPERAND (stmt, 0));
+         if (integer_zerop (cond))
+           then_fn = cp_fold_immediate_r;
+         else if (TREE_CODE (cond) == INTEGER_CST)
+           else_fn = cp_fold_immediate_r;
+
+         cp_walk_tree (&TREE_OPERAND (stmt, 0), cp_fold_r, data, nullptr);


I wonder about doing this before maybe_constant_value, to hopefully reduce
the redundant calculations?  OK either way.


Yeah, I was toying with that, I had

    foo() ? x : y

where foo was a constexpr function but the cp_fold_r on op0 didn't eval it
to a constant :(.


IIUC that's because cp_fold evaluates constexpr calls only with -O, so
doing cp_fold_r before maybe_constant_value on the condition should
still be desirable in that case?


Hmm, and if the cp_fold_r doesn't reduce the test to a constant, I would
think that folding the COND_EXPR also won't discard the other branch, so we
shouldn't need to work harder to get a constant here, so we don't need to
call maybe_constant_value at all?


Sorry, I hadn't seen this message when I posted the tweak.  But the
maybe_constant_value call on TREE_OPERAND (stmt, 0) should still make
sense because it can render a branch dead even on -O0.  Right?

But if op0 isn't constant after cp_fold, folding the COND_EXPR won'tdiscard the branch, so we don't need to do the extra work to find outthat it's actually dead.


Jason

Re: [PATCH v3] c++: Fix compile-time-hog in cp_fold_immediate_r [PR111660]

Reply via email to