On 9/27/2024 1:24 PM, Richard Biener wrote:
> On Mon, 23 Sep 2024, Artemiy Volkov wrote:
> 
>> Implement a match.pd transformation inverting the sign of X in
>> C1 - X cmp C2, where C1 and C2 are integer constants and X is
>> of a wrapping signed type, by observing that:
>>
>> (a) If cmp is == or !=, simply move X and C2 to opposite sides of
>> the comparison to arrive at X cmp C1 - C2.
>>
>> (b) If cmp is <:
>>      - C1 - X < C2 means that C1 - X spans the values of -INF,
>>        -INF + 1, ..., C2 - 1;
>>          - Therefore, X is one of C1 - -INF, C1 - (-INF + 1), ...,
>>        C1 - C2 + 1;
>>      - Subtracting (C1 + 1), X - (C1 + 1) is one of - (-INF) - 1,
>>            - (-INF) - 2, ..., -C2;
>>          - Using the fact that - (-INF) - 1 is +INF, derive that
>>            X - (C1 + 1) spans the values +INF, +INF - 1, ..., -C2;
>>          - Thus, the original expression can be simplified to
>>            X - (C1 + 1) > -C2 - 1.
>>
>> (c) Similarly, C1 - X <= C2 is equivalent to X - (C1 + 1) >= -C2 - 1.
>>
>> (d) The >= and > cases are negations of (b) and (c), respectively.
>>
>> (e) In all cases, the expression -C2 - 1 can be shortened to
>> bit_not (C2).
>>
>> This transformation allows to occasionally save load-immediate /
>> subtraction instructions, e.g. the following statement:
>>
>> 10 - (int)f() >= 20;
>>
>> now compiles to
>>
>> addi    a0,a0,-11
>> slti    a0,a0,-20
>>
>> instead of
>>
>> li      a5,10
>> sub     a0,a5,a0
>> slti    t0,a0,20
>> xori    a0,t0,1
>>
>> on 32-bit RISC-V when compiled with -fwrapv.
>>
>> Additional examples can be found in the newly added test file.  This
>> patch has been bootstrapped and regtested on aarch64, x86_64, and i386,
>> and additionally regtested on riscv32.
>>
>> gcc/ChangeLog:
>>
>>          PR tree-optimization/116024
>>          * match.pd: New transformation around integer comparison.
>>
>> gcc/testsuite/ChangeLog:
>>
>>          * gcc.dg/tree-ssa/pr116024-1-fwrapv.c: New test.
>>
>> Signed-off-by: Artemiy Volkov <arte...@synopsys.com>
>> ---
>>   gcc/match.pd                                  | 20 ++++-
>>   .../gcc.dg/tree-ssa/pr116024-1-fwrapv.c       | 73 +++++++++++++++++++
>>   2 files changed, 92 insertions(+), 1 deletion(-)
>>   create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/pr116024-1-fwrapv.c
>>
>> diff --git a/gcc/match.pd b/gcc/match.pd
>> index d0489789527..bf3b4a2e3fe 100644
>> --- a/gcc/match.pd
>> +++ b/gcc/match.pd
>> @@ -8970,7 +8970,25 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
>>       (cmp (plus @1 (minus @2 @0)) @2))
>>          (if (cmp == LT_EXPR || cmp == GE_EXPR)
>>       (cmp (plus @1 (minus @2
>> -               (plus @0 { build_one_cst (TREE_TYPE (@1)); }))) @2)))))))
>> +               (plus @0 { build_one_cst (TREE_TYPE (@1)); }))) @2)))
>> +/* For wrapping signed types (-fwrapv), transform like so (using < as 
>> example):
>> +     C1 - X < C2
>> +      ==>  C1 - X = { -INF, -INF + 1, ..., C2 - 1 }
>> +      ==>  X = { C1 - (-INF), C1 - (-INF + 1), ..., C1 - C2 + 1 }
>> +      ==>  X - (C1 + 1) = { - (-INF) - 1, - (-INF) - 2, ..., -C2 }
>> +      ==>  X - (C1 + 1) = { +INF, +INF - 1, ..., -C2 }
>> +      ==>  X - (C1 + 1) > -C2 - 1
>> +      ==>  X - (C1 + 1) > bit_not (C2)
>> +
>> +      Similarly,
>> +     C1 - X <= C2 ==> X - (C1 + 1) >= bit_not (C2);
>> +     C1 - X >= C2 ==> X - (C1 + 1) <= bit_not (C2);
>> +     C1 - X > C2 ==> X - (C1 + 1) < bit_not (C2).  */
>> +   (if (TYPE_OVERFLOW_WRAPS (TREE_TYPE (@1)))
> 
> You need to add && !TYPE_UNSIGNED (TREE_TYPE (@1)) here,
> TYPE_OVERFLOW_WRAPS is also true for unsigned types.

This pattern is written as a series of nested ifs, and this if is 
actually the else arm of the
     if (TYPE_UNSIGNED (TREE_TYPE (@1))
check on line 9078 (introduced in patch #2), so if we got here then 
TYPE_UNSIGNED is already necessarily false (and thus this code already 
only deals with signed wrapping types).  I don't think the extra check 
is necessary, but I can add it if it makes things clearer.  What do you 
think?

Thanks,
Artemiy

> 
> OK with that change.
> 
> Thanks,
> Richard.
> 
>> +     (if (cmp == EQ_EXPR || cmp == NE_EXPR)
>> +    (cmp @1 (minus @0 @2))
>> +     (rcmp (minus @1 (plus @0 { build_one_cst (TREE_TYPE (@1)); }))
>> +         (bit_not @2))))))))
>>   
>>   /* Canonicalizations of BIT_FIELD_REFs.  */
>>   
>> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr116024-1-fwrapv.c 
>> b/gcc/testsuite/gcc.dg/tree-ssa/pr116024-1-fwrapv.c
>> new file mode 100644
>> index 00000000000..c2bf1d17234
>> --- /dev/null
>> +++ b/gcc/testsuite/gcc.dg/tree-ssa/pr116024-1-fwrapv.c
>> @@ -0,0 +1,73 @@
>> +/* PR tree-optimization/116024 */
>> +/* { dg-do compile } */
>> +/* { dg-options "-O1 -fdump-tree-forwprop1-details -fwrapv" } */
>> +
>> +#include <stdint.h>
>> +
>> +uint32_t f(void);
>> +
>> +int32_t i2(void)
>> +{
>> +  int32_t l = 2;
>> +  l = 10 - (int32_t)f();
>> +  return l <= 20; // f() - 11 >= -21
>> +}
>> +
>> +int32_t i2a(void)
>> +{
>> +  int32_t l = 2;
>> +  l = 10 - (int32_t)f();
>> +  return l < 30; // f() - 11 > -31
>> +}
>> +
>> +int32_t i2b(void)
>> +{
>> +  int32_t l = 2;
>> +  l = 200 - (int32_t)f();
>> +  return l <= 100; // f() - 201 >= -101
>> +}
>> +
>> +int32_t i2c(void)
>> +{
>> +  int32_t l = 2;
>> +  l = 300 - (int32_t)f();
>> +  return l < 100; // f() - 301 > -101
>> +}
>> +
>> +int32_t i2d(void)
>> +{
>> +  int32_t l = 2;
>> +  l = 1000 - (int32_t)f();
>> +  return l >= 2000; // f() - 1001 <= -2001
>> +}
>> +
>> +int32_t i2e(void)
>> +{
>> +  int32_t l = 2;
>> +  l = 1000 - (int32_t)f();
>> +  return l > 3000; // f() - 1001 < -3001
>> +}
>> +
>> +int32_t i2f(void)
>> +{
>> +  int32_t l = 2;
>> +  l = 20000 - (int32_t)f();
>> +  return l >= 10000; // f() - 20001 <= -10001
>> +}
>> +
>> +int32_t i2g(void)
>> +{
>> +  int32_t l = 2;
>> +  l = 30000 - (int32_t)f();
>> +  return l > 10000; // f() - 30001 < -10001
>> +}
>> +
>> +/* { dg-final { scan-tree-dump-times "Removing dead stmt:.*?- _" 8 
>> "forwprop1" } } */
>> +/* { dg-final { scan-tree-dump-times "gimple_simplified to.* \\+ 
>> -11.*\n.*>= -21" 1 "forwprop1" } } */
>> +/* { dg-final { scan-tree-dump-times "gimple_simplified to.* \\+ 
>> -11.*\n.*>= -30" 1 "forwprop1" } } */
>> +/* { dg-final { scan-tree-dump-times "gimple_simplified to.* \\+ 
>> -201.*\n.*>= -101" 1 "forwprop1" } } */
>> +/* { dg-final { scan-tree-dump-times "gimple_simplified to.* \\+ 
>> -301.*\n.*>= -100" 1 "forwprop1" } } */
>> +/* { dg-final { scan-tree-dump-times "gimple_simplified to.* \\+ 
>> -1001.*\n.*< -2000" 1 "forwprop1" } } */
>> +/* { dg-final { scan-tree-dump-times "gimple_simplified to.* \\+ 
>> -1001.*\n.*< -3001" 1 "forwprop1" } } */
>> +/* { dg-final { scan-tree-dump-times "gimple_simplified to.* \\+ 
>> -20001.*\n.*< -10000" 1 "forwprop1" } } */
>> +/* { dg-final { scan-tree-dump-times "gimple_simplified to.* \\+ 
>> -30001.*\n.*< -10001" 1 "forwprop1" } } */
>>
> 

Reply via email to