On Mon, Apr 16, 2012 at 3:58 PM, Kai Tietz <ktiet...@googlemail.com> wrote:
> 2012/4/12 Richard Guenther <richard.guent...@gmail.com>:
>> On Thu, Apr 5, 2012 at 6:15 PM, Kai Tietz <ktiet...@googlemail.com> wrote:
>>> Hello,
>>>
>>> this patch adds some basic folding capabilities to fold-const for
>>> equal and non-equal comparisons on integer-typed arguments.
>>>
>>> ChangeLog
>>>
>>> 2012-04-05  Kai Tietz  <kti...@redhat.com>
>>>
>>>         * fold-const.c (fold_comparison_1): New function.
>>>         (fold_comparison): Use fold_comparison_1.
>>>
>>> 2012-04-05  Kai Tietz  <kti...@redhat.com>
>>>
>>>         * gcc.dg/fold-compare-1.c: Adjust matching rule
>>>         for a == b without argument swap.
>>>         * gcc.dg/fold-compare-7.c: New test.
>>>
>>> Regression tested for x86_64-unknown-linux-gnu for all languages
>>> (including Ada and Obj-C++).  Ok to apply?
>>>
>>> Regards,
>>> Kai
>>>
>>> Index: gcc/gcc/fold-const.c
>>> ===================================================================
>>> --- gcc.orig/gcc/fold-const.c
>>> +++ gcc/gcc/fold-const.c
>>> @@ -8739,6 +8739,241 @@ pointer_may_wrap_p (tree base, tree offs
>>>    return total_low > (unsigned HOST_WIDE_INT) size;
>>>  }
>>>
>>> +/* Sub-routine of fold_comparison.  Folding of EQ_EXPR/NE_EXPR
>>> +   comparisons with integral typed arguments.  */
>>> +
>>> +static tree
>>> +fold_comparison_1 (location_t loc, enum tree_code code, tree type,
>>> +                   tree arg0, tree arg1)
>>
>> Please name it more specifically, like for example
>> fold_integral_equality_comparison.
>>
>>> +{
>>> +  enum tree_code c1, c2;
>>> +  tree optype, op0, op1, opr0, opr1, tem;
>>> +
>>> +  if (code != NE_EXPR && code != EQ_EXPR)
>>> +    return NULL_TREE;
>>> +
>>> +  optype = TREE_TYPE (arg0);
>>> +  if (!INTEGRAL_TYPE_P (optype))
>>> +    return NULL_TREE;
>>> +
>>> +  c1 = TREE_CODE (arg0);
>>> +  c2 = TREE_CODE (arg1);
>>> +
>>> +  /* Integer constant is on right-hand side.  */
>>> +  if (c1 == INTEGER_CST
>>> +      && c2 != c1)
>>> +    return fold_build2_loc (loc, code, type, arg1, arg0);
>>
>>   /* If one arg is a real or integer constant, put it last.  */
>>   if (tree_swap_operands_p (arg0, arg1, true))
>>     return fold_build2_loc (loc, swap_tree_comparison (code), type, op1, op0);
>>
>> in fold_comparison already ensures this.
>>
>>> +  if (!TREE_SIDE_EFFECTS (arg0)
>>> +      && operand_equal_p (arg0, arg1, 0))
>>> +    {
>>> +      if (code == EQ_EXPR)
>>> +        return build_one_cst (type);
>>> +      return build_zero_cst (type);
>>> +    }
>>
>> This is already done in a more general way in fold_comparison:
>
> Yes, it was a duplicate, like ...
>
>>   /* Simplify comparison of something with itself.  (For IEEE
>>      floating-point, we can only do some of these simplifications.)  */
>>   if (operand_equal_p (arg0, arg1, 0))
>>     {
>>       switch (code)
>>         {
>>         case EQ_EXPR:
>> ...
>>
>> which also shows how to fold to true/false - using constant_boolean_node.
>
> ... like this one.  So I removed it from the patch.
>
>>> +
>>> +  if (c1 == NEGATE_EXPR)
>>> +    {
>>> +      op0 = TREE_OPERAND (arg0, 0);
>>> +      /* -X ==/!= -Y -> X ==/!= Y.  */
>>> +      if (c2 == c1)
>>> +        return fold_build2_loc (loc, code, type,
>>> +                                op0,
>>> +                                TREE_OPERAND (arg1, 0));
>>
>> This is already done, in a more general way but only for float types,
>> in fold_comparison.  It's beyond me why it is conditional on float types
>> there and does not check for trapping math and NaNs (maybe that's
>> well-defined - one would need to double-check).  For integral types
>> you'd have to care for undefined overflow (or issue a warning), and ...
>
> You are missing a point here about ==/!= comparisons.  For such
> comparisons the negate can be removed unconditionally, as no overflow
> can happen that would change the result of the compare.  It would even
> be a flaw to check for overflow in those cases here.
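
[For illustration only: a minimal sketch of this claim, not part of the
original thread.  It assumes -fwrapv so that the signed negation wraps
instead of invoking undefined behavior.]

  /* Under wrapping two's-complement arithmetic, negation is a
     bijection, so -x == -y holds exactly when x == y, even when the
     negation itself wraps (as it does for INT_MIN).  */
  #include <limits.h>
  #include <stdio.h>

  int main (void)
  {
    int x = INT_MIN, y = INT_MIN;
    printf ("%d %d\n", -x == -y, x == y);   /* prints "1 1" */
    return 0;
  }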
You miss the fact that the _negate_ can overflow.  Thus, with -ftrapv
you can end up removing a side-effect.

>>> +      /* -X ==/!= CST -> X ==/!= CST' with CST'= -CST.  */
>>> +      else if (c2 == INTEGER_CST)
>>> +        return fold_build2_loc (loc, code, type, op0,
>>> +                                fold_build1_loc (loc, NEGATE_EXPR,
>>> +                                                 optype, arg1));
>>
>> ... generalizing this the code should use negate_expr_p / negate_expr
>> to for example handle folding -b != b - a to b != a - b (of course you'd
>> require at least one explicit NEGATE_EXPR - similar foldings elsewhere
>> will tell you what to do).
>
> See above.  No, it would be a failure to use negate_expr_p here, as
> the overflow simply doesn't matter and there is also no need to warn
> about it.

negate_expr_p is not about overflow but about "can we cheaply negate
this?".  Thus, if you have -X == Y and negate_expr_p returns true for Y
you can fold it to X == negate_expr (Y).  No need to only handle
integer constants.

>>> +    }
>>> +  else if (c1 == BIT_NOT_EXPR)
>>> +    {
>>> +      op0 = TREE_OPERAND (arg0, 0);
>>> +      /* ~X ==/!= ~Y -> X ==/!= Y.  */
>>> +      if (c2 == c1)
>>> +        return fold_build2_loc (loc, code, type, op0,
>>> +                                TREE_OPERAND (arg1, 0));
>>
>> This can be generalized to relational comparisons as well, I think.  It's
>> also already done in fold_comparison here:
>
> No, it isn't.  Again, for ==/!= comparisons the overflow simply
> doesn't matter.  Therefore I added this function to special-case
> (non-)equal comparisons.  The overflow cases are already handled for
> general comparisons; no need to do it twice.
>
>>   /* Fold ~X op ~Y as Y op X.  */
>>   if (TREE_CODE (arg0) == BIT_NOT_EXPR
>>       && TREE_CODE (arg1) == BIT_NOT_EXPR)
>>     {

Where do you see the overflow checking here?

>>
>>> +      /* ~X ==/!= CST -> X ==/!= CST' with CST'= ~CST.  */
>>> +      else if (c2 == INTEGER_CST)
>>> +        return fold_build2_loc (loc, code, type, op0,
>>> +                                fold_build1_loc (loc, BIT_NOT_EXPR,
>>> +                                                 optype, arg1));
>>
>> Possibly unified with having a new predicate/worker invert_expr_p /
>> invert_expr.
>
> Well, there is no need for an invert_expr_p (see above).  Also in this
> case we neither need nor have to warn.

See my comment about negate_expr_p.

>>> +    }
>>> +
>>> +  /* See if we can simplify case X == (Y +|-|^ Z).  */
>>> +  if (c1 != PLUS_EXPR && c1 != MINUS_EXPR && c1 != BIT_XOR_EXPR)
>>> +    {
>>> +      if ((c2 != PLUS_EXPR && c2 != MINUS_EXPR
>>> +           && c2 != BIT_XOR_EXPR)
>>> +          || TREE_SIDE_EFFECTS (arg0))
>>> +        return NULL_TREE;
>>
>> (I'm not sure why you are testing for side-effects - if you omit sth use
>> omit_*_operand ())
>
> Actually, using omit_*_operand () introduces NON_LVALUE_EXPR
> expressions for the non-volatile cases, which are pointless within
> comparisons.  Also, it wasn't quite clear whether the following
> reduction of volatiles within a comparison is valid.  At least for
> subtractions we don't do this optimization, so I would assume it
> would be wrong for comparisons, too.

omit_*_operand emits COMPOUND_EXPRs, not NON_LVALUE_EXPRs.

Richard.

>
>>> +
>>> +      op0 = TREE_OPERAND (arg1, 0);
>>> +      op1 = TREE_OPERAND (arg1, 1);
>>
>> Please use names like arg10 and arg11 as elsewhere in folding.
>>
>>> +      /* Convert temporary X - Y to X + (-Y).  */
>>> +      if (c2 == MINUS_EXPR)
>>> +        op1 = fold_build1_loc (loc, NEGATE_EXPR, optype, op1);
>>
>> That's not a good idea - in general we avoid building scratch trees
>> during folding.
>
> Well, these patterns can of course be written out explicitly.  But
> doing this transformation simplifies the patterns used later on a bit.
> Also, it can later be used for further optimizations regarding value
> prediction, for patterns like 'a == 1 - a' being always false for all
> integer values, etc.
>
>>> +
>>> +      /* Check if we can simplify X ==/!= (X ^ Y) to Y ==/!= 0,
>>> +         or X ==/!= (X +|- Y) to Y ==/!= 0.  */
>>> +      tem = fold_build2_loc (loc, (c2 == BIT_XOR_EXPR ? c2 : MINUS_EXPR),
>>> +                             optype, arg0, op0);
>>
>> Similar - also this code and the code below duplicates things four times.
>> That's both expensive and hard to read.  It asks for some factorization
>> and use of explicit pattern matching instead of recursing into folding.
>
> Not quite sure what you mean here by recursion.  We actually have
> three cases here (with op being a PLUS/MINUS or XOR expression):
>
> 1) A !=/== B (which is already covered in the general comparison code)
>
> 2) A !=/== (B op C) or (A op B) !=/== C (this I duplicated in the code,
> but it could of course be factored out into a helper routine)
>
> and
>
> 3) (A op B) !=/== (C op D)
>
>>> +      if (TREE_CODE (tem) == INTEGER_CST
>>> +          && (integer_zerop (tem) || TYPE_UNSIGNED (optype)
>>> +              || c2 == BIT_XOR_EXPR))
>>> +        return fold_build2_loc (loc, code, type, op1, tem);
>>> +
>>> +      /* Check if we can simplify X ==/!= (Y ^ X) to Y ==/!= 0,
>>> +         or X ==/!= (Y + X) to Y ==/!= 0.  */
>>> +      tem = fold_build2_loc (loc, (c2 == BIT_XOR_EXPR ? c2 : MINUS_EXPR),
>>> +                             optype, arg0, op1);
>>> +      if (TREE_CODE (tem) == INTEGER_CST
>>> +          && (integer_zerop (tem) || TYPE_UNSIGNED (optype)
>>> +              || c2 == BIT_XOR_EXPR))
>>> +        return fold_build2_loc (loc, code, type, op0, tem);
>>> +    }
>>> +  else if (c2 != PLUS_EXPR && c2 != MINUS_EXPR && c2 != BIT_XOR_EXPR)
>>> +    {
>>> +      if ((c1 != PLUS_EXPR && c1 != MINUS_EXPR
>>> +           && c1 != BIT_XOR_EXPR)
>>> +          || TREE_SIDE_EFFECTS (arg1))
>>> +        return NULL_TREE;
>>> +
>>> +      op0 = TREE_OPERAND (arg0, 0);
>>> +      op1 = TREE_OPERAND (arg0, 1);
>>> +
>>> +      /* Convert temporary X - Y to X + (-Y).  */
>>> +      if (c1 == MINUS_EXPR)
>>> +        op1 = fold_build1_loc (loc, NEGATE_EXPR, optype, op1);
>>> +
>>> +      /* Check if we can simplify X ==/!= (X ^ Y) to Y ==/!= 0,
>>> +         or X ==/!= (X +|- Y) to Y ==/!= 0.  */
>>> +      tem = fold_build2_loc (loc, (c1 == BIT_XOR_EXPR ? c1 : MINUS_EXPR),
>>> +                             optype, arg1, op0);
>>> +      if (TREE_CODE (tem) == INTEGER_CST
>>> +          && (integer_zerop (tem) || TYPE_UNSIGNED (optype)
>>> +              || c1 == BIT_XOR_EXPR))
>>> +        return fold_build2_loc (loc, code, type, op1, tem);
>>> +
>>> +      /* Check if we can simplify X ==/!= (Y ^ X) to Y ==/!= 0,
>>> +         or X ==/!= (Y + X) to Y ==/!= 0.  */
>>> +      tem = fold_build2_loc (loc, (c1 == BIT_XOR_EXPR ? c1 : MINUS_EXPR),
>>> +                             optype, arg1, op1);
>>> +      if (TREE_CODE (tem) == INTEGER_CST
>>> +          && (integer_zerop (tem) || TYPE_UNSIGNED (optype)
>>> +              || c1 == BIT_XOR_EXPR))
>>> +        return fold_build2_loc (loc, code, type, op0, tem);
>>> +    }
>>> +
>>> +  /* We check if arg1 and arg2 are matching one of the patterns
>>> +     (V + W) ==/!= (X + Y), (V + W) ==/!= (X - Y), (V - W) ==/!= (X + Y),
>>> +     (V - W) ==/!= (X - Y), or (V ^ W) ==/!= (X ^ Y).  */
>>
>> I stopped reading here.  Please try to double-check what we already do,
>> don't produce new code for everything you can think of.  This patch could
>> have needed splitting, too.
>>
>> Richard.
>>
>>> +  if ((c1 != PLUS_EXPR && c1 != MINUS_EXPR && c1 != BIT_XOR_EXPR)
>>> +      || (c2 != PLUS_EXPR && c2 != MINUS_EXPR && c2 != BIT_XOR_EXPR))
>>> +    return NULL_TREE;
>>> +  if (c1 != c2 && (c1 == BIT_XOR_EXPR || c2 == BIT_XOR_EXPR))
>>> +    return NULL_TREE;
>>> +
>>> +  op0 = TREE_OPERAND (arg0, 0);
>>> +  op1 = TREE_OPERAND (arg0, 1);
>>> +  opr0 = TREE_OPERAND (arg1, 0);
>>> +  opr1 = TREE_OPERAND (arg1, 1);
>>> +
>>> +  /* Convert temporary (X - Y) to (X + (-Y)).  */
>>> +  if (c1 == MINUS_EXPR)
>>> +    {
>>> +      op1 = fold_build1_loc (loc, NEGATE_EXPR, optype, op1);
>>> +      c1 = PLUS_EXPR;
>>> +    }
>>> +
>>> +  /* Convert temporary (X - Y) to (X + (-Y)).  */
>>> +  if (c2 == MINUS_EXPR)
>>> +    {
>>> +      opr1 = fold_build1_loc (loc, NEGATE_EXPR, optype, opr1);
>>> +      c2 = PLUS_EXPR;
>>> +    }
>>> +
>>> +  if (c1 != c2)
>>> +    return NULL_TREE;
>>> +
>>> +  /* If OP0 has no side-effects, we might be able to optimize
>>> +     (OP0 + OP1) ==/!= (OP0 + OPR1) to OP1 ==/!= OPR1,
>>> +     (OP0 + OP1) ==/!= (OPR0 + OP0) to OP1 ==/!= OPR0,
>>> +     (OP0 ^ OP1) ==/!= (OP0 ^ OPR1) to OP1 ==/!= OPR1,
>>> +     or (OP0 ^ OP1) ==/!= (OPR0 ^ OP0) to OP1 ==/!= OPR0.  */
>>> +  if (!TREE_SIDE_EFFECTS (op0))
>>> +    {
>>> +      tem = fold_build2_loc (loc, (c1 == PLUS_EXPR ? MINUS_EXPR : c1),
>>> +                             optype, op0, opr0);
>>> +      if (TREE_CODE (tem) == INTEGER_CST
>>> +          && !TREE_SIDE_EFFECTS (opr0)
>>> +          && (integer_zerop (tem) || TYPE_UNSIGNED (optype)
>>> +              || c1 == BIT_XOR_EXPR))
>>> +        {
>>> +          if (!integer_zerop (tem))
>>> +            tem = fold_build2_loc (loc, c1, optype, op1, tem);
>>> +          else
>>> +            tem = op1;
>>> +
>>> +          return fold_build2_loc (loc, code, type, tem, opr1);
>>> +        }
>>> +      tem = fold_build2_loc (loc, (c1 == PLUS_EXPR ? MINUS_EXPR : c1),
>>> +                             optype, op0, opr1);
>>> +      if (TREE_CODE (tem) == INTEGER_CST
>>> +          && !TREE_SIDE_EFFECTS (opr1)
>>> +          && (integer_zerop (tem) || TYPE_UNSIGNED (optype)
>>> +              || c1 == BIT_XOR_EXPR))
>>> +        {
>>> +          if (!integer_zerop (tem))
>>> +            tem = fold_build2_loc (loc, c1, optype, op1, tem);
>>> +          else
>>> +            tem = op1;
>>> +          return fold_build2_loc (loc, code, type, tem, opr0);
>>> +        }
>>> +    }
>>> +
>>> +  /* If OP1 has no side-effects, we might be able to optimize
>>> +     (OP0 + OP1) ==/!= (OP1 + OPR1) to OP0 ==/!= OPR1,
>>> +     (OP0 + OP1) ==/!= (OPR0 + OP1) to OP0 ==/!= OPR0,
>>> +     (OP0 ^ OP1) ==/!= (OP1 ^ OPR1) to OP0 ==/!= OPR1,
>>> +     or (OP0 ^ OP1) ==/!= (OPR0 ^ OP1) to OP0 ==/!= OPR0.  */
>>> +  if (!TREE_SIDE_EFFECTS (op1))
>>> +    {
>>> +      tem = fold_build2_loc (loc, (c1 == PLUS_EXPR ? MINUS_EXPR : c1),
>>> +                             optype, op1, opr0);
>>> +      if (TREE_CODE (tem) == INTEGER_CST
>>> +          && !TREE_SIDE_EFFECTS (opr0)
>>> +          && (integer_zerop (tem) || TYPE_UNSIGNED (optype)
>>> +              || c1 == BIT_XOR_EXPR))
>>> +        {
>>> +          if (!integer_zerop (tem))
>>> +            tem = fold_build2_loc (loc, c1, optype, op0, tem);
>>> +          else
>>> +            tem = op0;
>>> +          return fold_build2_loc (loc, code, type, tem, opr1);
>>> +        }
>>> +
>>> +      tem = fold_build2_loc (loc, (c1 == PLUS_EXPR ? MINUS_EXPR : c1),
>>> +                             optype, op1, opr1);
>>> +      if (TREE_CODE (tem) == INTEGER_CST
>>> +          && !TREE_SIDE_EFFECTS (opr1)
>>> +          && (integer_zerop (tem) || TYPE_UNSIGNED (optype)
>>> +              || c1 == BIT_XOR_EXPR))
>>> +        {
>>> +          if (!integer_zerop (tem))
>>> +            tem = fold_build2_loc (loc, c1, optype, op0, tem);
>>> +          else
>>> +            tem = op0;
>>> +          return fold_build2_loc (loc, code, type, tem, opr0);
>>> +        }
>>> +    }
>>> +
>>> +  return NULL_TREE;
>>> +}
>>> +
>>>  /* Subroutine of fold_binary.  This routine performs all of the
>>>     transformations that are common to the equality/inequality
>>>     operators (EQ_EXPR and NE_EXPR) and the ordering operators
>>> @@ -8767,6 +9002,10 @@ fold_comparison (location_t loc, enum tr
>>>    if (tree_swap_operands_p (arg0, arg1, true))
>>>      return fold_build2_loc (loc, swap_tree_comparison (code), type, op1,
>>>                              op0);
>>>
>>> +  tem = fold_comparison_1 (loc, code, type, arg0, arg1);
>>> +  if (tem != NULL_TREE)
>>> +    return tem;
>>> +
>>>    /* Transform comparisons of the form X +- C1 CMP C2 to X CMP C2 +- C1.  */
>>>    if ((TREE_CODE (arg0) == PLUS_EXPR || TREE_CODE (arg0) == MINUS_EXPR)
>>>        && (TREE_CODE (TREE_OPERAND (arg0, 1)) == INTEGER_CST
>>> Index: gcc/gcc/testsuite/gcc.dg/fold-compare-1.c
>>> ===================================================================
>>> --- gcc.orig/gcc/testsuite/gcc.dg/fold-compare-1.c
>>> +++ gcc/gcc/testsuite/gcc.dg/fold-compare-1.c
>>> @@ -41,7 +41,7 @@ int test8(int l)
>>>    return ~l >= 2;
>>>  }
>>>
>>> -/* { dg-final { scan-tree-dump-times "b == a" 1 "original" } } */
>>> +/* { dg-final { scan-tree-dump-times "b == a|a == b" 1 "original" } } */
>>>  /* { dg-final { scan-tree-dump-times "c == d" 1 "original" } } */
>>>  /* { dg-final { scan-tree-dump-times "e == -5" 1 "original" } } */
>>>  /* { dg-final { scan-tree-dump-times "f == -6" 1 "original" } } */
>>> Index: gcc/gcc/testsuite/gcc.dg/fold-compare-7.c
>>> ===================================================================
>>> --- /dev/null
>>> +++ gcc/gcc/testsuite/gcc.dg/fold-compare-7.c
>>> @@ -0,0 +1,36 @@
>>> +/* { dg-do compile } */
>>> +/* { dg-options "-O2 -fdump-tree-original" } */
>>> +
>>> +int test1(int a, int elim)
>>> +{
>>> +  return ~elim == (elim ^ a);
>>> +}
>>> +
>>> +int test2(int elim, int b)
>>> +{
>>> +  return -elim == (b - elim);
>>> +}
>>> +
>>> +int test3(int c, int elim, int d)
>>> +{
>>> +  return (c + elim) != (elim + d);
>>> +}
>>> +
>>> +int test4(int e, int f, int elim)
>>> +{
>>> +  return (e - elim) != (-elim + f);
>>> +}
>>> +
>>> +int test5(int g, int h, int elim)
>>> +{
>>> +  return (elim ^ g) == (h ^ elim);
>>> +}
>>> +
>>> +int test6(int i, int j, int elim)
>>> +{
>>> +  return (elim ^ i) == (j ^ ~elim);
>>> +}
>>> +
>>> +/* { dg-final { scan-tree-dump-times "elim" 0 "original" } } */
>>> +/* { dg-final { cleanup-tree-dump "original" } } */
>>> +
>
>
> Regards,
> Kai
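
[For reference, not part of the thread: the negate_expr_p / negate_expr
generalization Richard suggests above might look roughly like this
inside fold-const.c.  This is only a sketch under that assumption, not
the committed code.]

  /* Sketch: fold -X ==/!= Y to X ==/!= Y', with Y' = -Y, whenever Y
     is cheaply negatable.  negate_expr_p tests exactly that, so
     integer constants no longer need to be special-cased.  */
  if (TREE_CODE (arg0) == NEGATE_EXPR
      && negate_expr_p (arg1))
    return fold_build2_loc (loc, code, type,
                            TREE_OPERAND (arg0, 0),
                            negate_expr (arg1));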