On Wed, 18 Dec 2024, Jason Merrill wrote:

> On 12/17/24 10:43 AM, Patrick Palka wrote:
> > Bootstrapped and regtested on x86_64-pc-linux-gnu, does this look
> > OK for trunk?  Shall we also backport this to release branches?
> > It's not a regression but seems like a safe fix for an inconvenient
> > issue.
> 
> OK for trunk and 14.
> 
> I wonder about using __builtin_mul_overflow et al (wrapped in a "non-overflow
> integer" class template?) to fail better on an even more extreme testcase.

That sounds prudent because the testcase in the PR almost overflows even
when using unsigned HOST_WIDE_INT (and still overflows when using signed
HOST_WIDE_INT)!

Conveniently we already have three-parameter versions of add_hwi / mul_hwi
that we can use here to track overflow.  But rather than explicitly
tracking overflow which would be a bit cumbersome without some kind of
class template abstraction, it seems all we really need to saturating
addition/multiplication helpers that clamp an overflowed operation to
HOST_WIDE_INT_MAX.

To that end I added add_sat_hwi / mul_sat_hwi functions to hwint.h
(since they seem like generally useful operation) and used it in
cnf/dnf_size_r.  Like so?  Bootstrapped and regtested on
x86_64-pc-linux-gnu.

-- >8 --

Subject: [PATCH] c++: integer overflow during subsumption [PR118069]

For the testcase in the PR we hang during constraint subsumption
ultimately because one of the constraints is complex enough that its
conjunctive normal form is calculated to have more than 2^31 clauses,
which causes the size calculation (through an int) to overflow and so
the optimization in subsumes_constraints_nonnull

  if (dnf_size (lhs) <= cnf_size (rhs))
    // iterate over DNF of LHS
  else
    // iterate over CNF of RHS

incorrectly decides to loop over the CNF (billions of clauses) instead
of the DNF (thousands of clauses).

I haven't verified that the result of cnf_size is correct for the
problematic constraint but integer overflow is definitely plausible
given that CNF/DNF can be exponentially larger than the original
constraint in the worst case.

This patch fixes this by using a 64-bit saturating arithmetic during
these size calculations via new add/mul_sat_hwi functions so that
overflow is less likely and if it does occur we handle it gracefully.
It should be highly unlikely that both the DNF and CNF size calculations
overflow, and if they do then it doesn't matter which form we select,
subsumption will take forever either way. The testcase now compiles in
~3 seconds on my machine after this change.

        PR c++/118069

gcc/ChangeLog:

        * hwint.h (add_sat_hwi): New function.
        (mul_sat_hwi): Likewise.

gcc/cp/ChangeLog:

        * logic.cc (dnf_size_r): Use HOST_WIDE_INT instead of int, and
        handle overflow gracefully via add_sat_hwi and mul_sat_hwi.
        (cnf_size_r): Likewise.
        (dnf_size): Use HOST_WIDE_INT instead of int.
        (cnf_size): Likewise.
---
 gcc/cp/logic.cc | 68 +++++++++++++++++++++++++++----------------------
 gcc/hwint.h     | 26 +++++++++++++++++++
 2 files changed, 63 insertions(+), 31 deletions(-)

diff --git a/gcc/cp/logic.cc b/gcc/cp/logic.cc
index 9d8edb74099..6b4bf1dfb7d 100644
--- a/gcc/cp/logic.cc
+++ b/gcc/cp/logic.cc
@@ -349,7 +349,7 @@ atomic_p (tree t)
    distributing.  In general, a conjunction for which this flag is set
    is considered a disjunction for the purpose of counting.  */
 
-static std::pair<int, bool>
+static std::pair<HOST_WIDE_INT, bool>
 dnf_size_r (tree t)
 {
   if (atomic_p (t))
@@ -360,9 +360,9 @@ dnf_size_r (tree t)
      the results.  */
   tree lhs = TREE_OPERAND (t, 0);
   tree rhs = TREE_OPERAND (t, 1);
-  std::pair<int, bool> p1 = dnf_size_r (lhs);
-  std::pair<int, bool> p2 = dnf_size_r (rhs);
-  int n1 = p1.first, n2 = p2.first;
+  auto p1 = dnf_size_r (lhs);
+  auto p2 = dnf_size_r (rhs);
+  HOST_WIDE_INT n1 = p1.first, n2 = p2.first;
   bool d1 = p1.second, d2 = p2.second;
 
   if (disjunction_p (t))
@@ -376,22 +376,24 @@ dnf_size_r (tree t)
        {
          if (disjunction_p (rhs) || (conjunction_p (rhs) && d2))
            /* Both P and Q are disjunctions.  */
-           return std::make_pair (n1 + n2, d1 | d2);
+           return std::make_pair (add_sat_hwi (n1, n2), d1 | d2);
          else
            /* Only LHS is a disjunction.  */
-           return std::make_pair (1 + n1 + n2, d1 | d2);
+           return std::make_pair (add_sat_hwi (1, add_sat_hwi (n1, n2)),
+                                  d1 | d2);
          gcc_unreachable ();
        }
       if (conjunction_p (lhs))
        {
          if ((disjunction_p (rhs) && d1) || (conjunction_p (rhs) && d1 && d2))
            /* Both P and Q are disjunctions.  */
-           return std::make_pair (n1 + n2, d1 | d2);
+           return std::make_pair (add_sat_hwi (n1, n2), d1 | d2);
          if (disjunction_p (rhs)
              || (conjunction_p (rhs) && d1 != d2)
              || (atomic_p (rhs) && d1))
            /* Either LHS or RHS is a disjunction.  */
-           return std::make_pair (1 + n1 + n2, d1 | d2);
+           return std::make_pair (add_sat_hwi (1, add_sat_hwi (n1, n2)),
+                                  d1 | d2);
          else
            /* Neither LHS nor RHS is a disjunction.  */
            return std::make_pair (2, false);
@@ -400,7 +402,8 @@ dnf_size_r (tree t)
        {
          if (disjunction_p (rhs) || (conjunction_p (rhs) && d2))
            /* Only RHS is a disjunction.  */
-           return std::make_pair (1 + n1 + n2, d1 | d2);
+           return std::make_pair (add_sat_hwi (1, add_sat_hwi (n1, n2)),
+                                  d1 | d2);
          else
            /* Neither LHS nor RHS is a disjunction.  */
            return std::make_pair (2, false);
@@ -418,22 +421,22 @@ dnf_size_r (tree t)
        {
          if (disjunction_p (rhs) || (conjunction_p (rhs) && d2))
            /* Both P and Q are disjunctions.  */
-           return std::make_pair (n1 * n2, true);
+           return std::make_pair (mul_sat_hwi (n1, n2), true);
          else
            /* Only LHS is a disjunction.  */
-           return std::make_pair (n1 + n2, true);
+           return std::make_pair (add_sat_hwi (n1, n2), true);
          gcc_unreachable ();
        }
       if (conjunction_p (lhs))
        {
          if ((disjunction_p (rhs) && d1) || (conjunction_p (rhs) && d1 && d2))
            /* Both P and Q are disjunctions.  */
-           return std::make_pair (n1 * n2, true);
+           return std::make_pair (mul_sat_hwi (n1, n2), true);
          if (disjunction_p (rhs)
              || (conjunction_p (rhs) && d1 != d2)
              || (atomic_p (rhs) && d1))
            /* Either LHS or RHS is a disjunction.  */
-           return std::make_pair (n1 + n2, true);
+           return std::make_pair (add_sat_hwi (n1, n2), true);
          else
            /* Neither LHS nor RHS is a disjunction.  */
            return std::make_pair (0, false);
@@ -457,7 +460,7 @@ dnf_size_r (tree t)
    distributing.  In general, a disjunction for which this flag is set
    is considered a conjunction for the purpose of counting.  */
 
-static std::pair<int, bool>
+static std::pair<HOST_WIDE_INT, bool>
 cnf_size_r (tree t)
 {
   if (atomic_p (t))
@@ -468,9 +471,9 @@ cnf_size_r (tree t)
      the results.  */
   tree lhs = TREE_OPERAND (t, 0);
   tree rhs = TREE_OPERAND (t, 1);
-  std::pair<int, bool> p1 = cnf_size_r (lhs);
-  std::pair<int, bool> p2 = cnf_size_r (rhs);
-  int n1 = p1.first, n2 = p2.first;
+  auto p1 = cnf_size_r (lhs);
+  auto p2 = cnf_size_r (rhs);
+  HOST_WIDE_INT n1 = p1.first, n2 = p2.first;
   bool d1 = p1.second, d2 = p2.second;
 
   if (disjunction_p (t))
@@ -485,12 +488,12 @@ cnf_size_r (tree t)
        {
          if ((disjunction_p (rhs) && d1 && d2) || (conjunction_p (rhs) && d1))
            /* Both P and Q are conjunctions.  */
-           return std::make_pair (n1 * n2, true);
+           return std::make_pair (mul_sat_hwi (n1, n2), true);
          if ((disjunction_p (rhs) && d1 != d2)
              || conjunction_p (rhs)
              || (atomic_p (rhs) && d1))
            /* Either LHS or RHS is a conjunction.  */
-           return std::make_pair (n1 + n2, true);
+           return std::make_pair (add_sat_hwi (n1, n2), true);
          else
            /* Neither LHS nor RHS is a conjunction.  */
            return std::make_pair (0, false);
@@ -499,16 +502,16 @@ cnf_size_r (tree t)
        {
          if ((disjunction_p (rhs) && d2) || conjunction_p (rhs))
            /* Both LHS and RHS are conjunctions.  */
-           return std::make_pair (n1 * n2, true);
+           return std::make_pair (mul_sat_hwi (n1, n2), true);
          else
            /* Only LHS is a conjunction.  */
-           return std::make_pair (n1 + n2, true);
+           return std::make_pair (add_sat_hwi (n1, n2), true);
        }
       if (atomic_p (lhs))
        {
          if ((disjunction_p (rhs) && d2) || conjunction_p (rhs))
            /* Only RHS is a disjunction.  */
-           return std::make_pair (n1 + n2, true);
+           return std::make_pair (add_sat_hwi (n1, n2), true);
          else
            /* Neither LHS nor RHS is a disjunction.  */
            return std::make_pair (0, false);
@@ -525,12 +528,13 @@ cnf_size_r (tree t)
        {
          if ((disjunction_p (rhs) && d1 && d2) || (conjunction_p (rhs) && d1))
            /* Both P and Q are conjunctions.  */
-           return std::make_pair (n1 + n2, d1 | d2);
+           return std::make_pair (add_sat_hwi (n1, n2), d1 | d2);
          if ((disjunction_p (rhs) && d1 != d2)
              || conjunction_p (rhs)
              || (atomic_p (rhs) && d1))
            /* Either LHS or RHS is a conjunction.  */
-           return std::make_pair (1 + n1 + n2, d1 | d2);
+           return std::make_pair (add_sat_hwi (1, add_sat_hwi (n1, n2)),
+                                  d1 | d2);
          else
            /* Neither LHS nor RHS is a conjunction.  */
            return std::make_pair (2, false);
@@ -539,16 +543,18 @@ cnf_size_r (tree t)
        {
          if ((disjunction_p (rhs) && d2) || conjunction_p (rhs))
            /* Both LHS and RHS are conjunctions.  */
-           return std::make_pair (n1 + n2, d1 | d2);
+           return std::make_pair (add_sat_hwi (n1, n2), d1 | d2);
          else
            /* Only LHS is a conjunction.  */
-           return std::make_pair (1 + n1 + n2, d1 | d2);
+           return std::make_pair (add_sat_hwi (1, add_sat_hwi (n1, n2)),
+                                  d1 | d2);
        }
       if (atomic_p (lhs))
        {
          if ((disjunction_p (rhs) && d2) || conjunction_p (rhs))
            /* Only RHS is a disjunction.  */
-           return std::make_pair (1 + n1 + n2, d1 | d2);
+           return std::make_pair (add_sat_hwi (1, add_sat_hwi (n1, n2)),
+                                  d1 | d2);
          else
            /* Neither LHS nor RHS is a disjunction.  */
            return std::make_pair (2, false);
@@ -560,10 +566,10 @@ cnf_size_r (tree t)
 /* Count the number conjunctive clauses that would be created
    when rewriting T to DNF.  */
 
-static int
+static HOST_WIDE_INT
 dnf_size (tree t)
 {
-  std::pair<int, bool> result = dnf_size_r (t);
+  auto result = dnf_size_r (t);
   return result.first == 0 ? 1 : result.first;
 }
 
@@ -571,10 +577,10 @@ dnf_size (tree t)
 /* Count the number disjunctive clauses that would be created
    when rewriting T to CNF.  */
 
-static int
+static HOST_WIDE_INT
 cnf_size (tree t)
 {
-  std::pair<int, bool> result = cnf_size_r (t);
+  auto result = cnf_size_r (t);
   return result.first == 0 ? 1 : result.first;
 }
 
diff --git a/gcc/hwint.h b/gcc/hwint.h
index f69b61db9cc..e07a12a4097 100644
--- a/gcc/hwint.h
+++ b/gcc/hwint.h
@@ -397,4 +397,30 @@ mul_hwi (HOST_WIDE_INT a, HOST_WIDE_INT b, bool *overflow)
 #endif
 }
 
+/* Compute the saturated sum of signed A and B, i.e. upon overflow clamp
+   the result to the corresponding extremum.  */
+
+inline HOST_WIDE_INT
+add_sat_hwi (HOST_WIDE_INT a, HOST_WIDE_INT b)
+{
+  bool overflow;
+  HOST_WIDE_INT result = add_hwi (a, b, &overflow);
+  if (!overflow)
+    return result;
+  return (a < 0) ? HOST_WIDE_INT_MIN : HOST_WIDE_INT_MAX;
+}
+
+/* Compute the saturated product of signed A and B, i.e. upon overflow clamp
+   the result to the corresponding extremum.  */
+
+inline HOST_WIDE_INT
+mul_sat_hwi (HOST_WIDE_INT a, HOST_WIDE_INT b)
+{
+  bool overflow;
+  HOST_WIDE_INT result = mul_hwi (a, b, &overflow);
+  if (!overflow)
+    return result;
+  return ((a < 0) != (b < 0)) ? HOST_WIDE_INT_MIN : HOST_WIDE_INT_MAX;
+}
+
 #endif /* ! GCC_HWINT_H */
-- 
2.48.0.rc0




> 
> > -- >8 --
> > 
> > For the testcase in the PR we hang during constraint normalization
> > ultimately because one of the constraints is complex enough that its
> > conjunctive normal form is calculated to have more than 2^31 clauses,
> > which causes the size calculation (through an int) to overflow and so
> > the optimization in subsumes_constraints_nonnull
> > 
> >    if (dnf_size (lhs) <= cnf_size (rhs))
> >      // iterate over DNF of LHS
> >    else
> >      // iterate over CNF of RHS
> > 
> > incorrectly decides to loop over the CNF (billions of clauses) instead
> > of the DNF (thousands of clauses).
> > 
> > I haven't verified that the result of cnf_size is correct for the
> > problematic constraint but integer overflow is definitely plausible
> > given that CNF/DNF can be exponentially larger than the original
> > constraint in the worst case.
> > 
> > This patch fixes this by using a 64-bit unsigned int during DNF/CNF size
> > calculation which is enough to avoid overflow in the testcase, and we now
> > compile it in ~3 seconds.
> > 
> > In theory doubling the precision will only let us handle a ~2x bigger
> > constraint before risking overflow in the worst case given the
> > exponential-ness, but I suppose it should suffice for now.
> > 
> >     PR c++/118069
> > 
> > gcc/cp/ChangeLog:
> > 
> >     * logic.cc (dnf_size_r): Use unsigned HOST_WIDE_INT instead of int.
> >     (cnf_size_r): Likewise.
> >     (dnf_size): Likewise.
> >     (cnf_size): Likewise.
> > ---
> >   gcc/cp/logic.cc | 24 ++++++++++++------------
> >   1 file changed, 12 insertions(+), 12 deletions(-)
> > 
> > diff --git a/gcc/cp/logic.cc b/gcc/cp/logic.cc
> > index fab2c357dc4..e46ea0ebb78 100644
> > --- a/gcc/cp/logic.cc
> > +++ b/gcc/cp/logic.cc
> > @@ -349,7 +349,7 @@ atomic_p (tree t)
> >      distributing.  In general, a conjunction for which this flag is set
> >      is considered a disjunction for the purpose of counting.  */
> >   -static std::pair<int, bool>
> > +static std::pair<unsigned HOST_WIDE_INT, bool>
> >   dnf_size_r (tree t)
> >   {
> >     if (atomic_p (t))
> > @@ -360,9 +360,9 @@ dnf_size_r (tree t)
> >        the results.  */
> >     tree lhs = TREE_OPERAND (t, 0);
> >     tree rhs = TREE_OPERAND (t, 1);
> > -  std::pair<int, bool> p1 = dnf_size_r (lhs);
> > -  std::pair<int, bool> p2 = dnf_size_r (rhs);
> > -  int n1 = p1.first, n2 = p2.first;
> > +  auto p1 = dnf_size_r (lhs);
> > +  auto p2 = dnf_size_r (rhs);
> > +  unsigned HOST_WIDE_INT n1 = p1.first, n2 = p2.first;
> >     bool d1 = p1.second, d2 = p2.second;
> >       if (disjunction_p (t))
> > @@ -457,7 +457,7 @@ dnf_size_r (tree t)
> >      distributing.  In general, a disjunction for which this flag is set
> >      is considered a conjunction for the purpose of counting.  */
> >   -static std::pair<int, bool>
> > +static std::pair<unsigned HOST_WIDE_INT, bool>
> >   cnf_size_r (tree t)
> >   {
> >     if (atomic_p (t))
> > @@ -468,9 +468,9 @@ cnf_size_r (tree t)
> >        the results.  */
> >     tree lhs = TREE_OPERAND (t, 0);
> >     tree rhs = TREE_OPERAND (t, 1);
> > -  std::pair<int, bool> p1 = cnf_size_r (lhs);
> > -  std::pair<int, bool> p2 = cnf_size_r (rhs);
> > -  int n1 = p1.first, n2 = p2.first;
> > +  auto p1 = cnf_size_r (lhs);
> > +  auto p2 = cnf_size_r (rhs);
> > +  unsigned HOST_WIDE_INT n1 = p1.first, n2 = p2.first;
> >     bool d1 = p1.second, d2 = p2.second;
> >       if (disjunction_p (t))
> > @@ -560,10 +560,10 @@ cnf_size_r (tree t)
> >   /* Count the number conjunctive clauses that would be created
> >      when rewriting T to DNF.  */
> >   -static int
> > +static unsigned HOST_WIDE_INT
> >   dnf_size (tree t)
> >   {
> > -  std::pair<int, bool> result = dnf_size_r (t);
> > +  auto result = dnf_size_r (t);
> >     return result.first == 0 ? 1 : result.first;
> >   }
> >   @@ -571,10 +571,10 @@ dnf_size (tree t)
> >   /* Count the number disjunctive clauses that would be created
> >      when rewriting T to CNF.  */
> >   -static int
> > +static unsigned HOST_WIDE_INT
> >   cnf_size (tree t)
> >   {
> > -  std::pair<int, bool> result = cnf_size_r (t);
> > +  auto result = cnf_size_r (t);
> >     return result.first == 0 ? 1 : result.first;
> >   }
> >   
> 
> 

Reply via email to