Based on the valuable feedback I received, I decided to implement the patch
in the RTL pipeline. Since a similar optimization already exists in
simplify_binary_operation_1, I chose to generalize my original approach
and place it directly below that code.

The expression (X xor C1) + C2 is simplified to X xor (C1 xor C2) when X is
known to be either 0 or 1 and 2 * ((X ^ C1) & C2) == 0 holds for both values
of X. This is a more general optimization, but it still covers the RISC-V
case that was my initial goal:

long f1(long x, long y) {
    return (x > y) ? 2 : 3;
}


Before the patch, the generated assembly is:

f1(long, long):
        sgt     a0,a0,a1
        xori    a0,a0,1
        addi    a0,a0,2
        ret

After the patch, the generated assembly is:

f1(long, long):
        sgt     a0,a0,a1
        xori    a0,a0,3
        ret


The patch optimizes cases like x LT/GT y ? 2 : 3 (and x GE/LE y ? 3 : 2),
as initially intended. Since this optimization is more general, I noticed
it also optimizes cases like x < CONST ? 3 : 2 when CONST < 0. I’ve added
tests for these cases as well.

A bit of the logic behind the patch: the identity
A + B == (A ^ B) + 2 * (A & B) always holds. Whenever 2 * (A & B) == 0,
the sum therefore reduces to A ^ B. In our case we have A == X ^ C1 and
B == C2, and since X is known to be either 0 or 1, the condition only
needs to hold for those two values of X.

2024-09-27  Jovan Vukic  <jovan.vu...@rt-rk.com>

        PR target/108038

gcc/ChangeLog:

        * simplify-rtx.cc (simplify_context::simplify_binary_operation_1): New
        simplification.

gcc/testsuite/ChangeLog:

        * gcc.target/riscv/slt-1.c: New test.
---
 gcc/simplify-rtx.cc                    | 12 ++++++
 gcc/testsuite/gcc.target/riscv/slt-1.c | 59 ++++++++++++++++++++++++++
 2 files changed, 71 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/riscv/slt-1.c

diff --git a/gcc/simplify-rtx.cc b/gcc/simplify-rtx.cc
index a20a61c5ddd..e8e60404ef6 100644
--- a/gcc/simplify-rtx.cc
+++ b/gcc/simplify-rtx.cc
@@ -2994,6 +2994,18 @@ simplify_context::simplify_binary_operation_1 (rtx_code code,
                                    simplify_gen_binary (XOR, mode, op1,
                                                         XEXP (op0, 1)));
 
+      /* (plus (xor X C1) C2) is (xor X (C1^C2)) if X is either 0 or 1 and
+        2 * ((X ^ C1) & C2) == 0; based on A + B == A ^ B + 2 * (A & B). */
+      if (CONST_SCALAR_INT_P (op1)
+         && GET_CODE (op0) == XOR
+         && CONST_SCALAR_INT_P (XEXP (op0, 1))
+         && nonzero_bits (XEXP (op0, 0), mode) == 1
+         && 2 * (INTVAL (XEXP (op0, 1)) & INTVAL (op1)) == 0
+         && 2 * ((1 ^ INTVAL (XEXP (op0, 1))) & INTVAL (op1)) == 0)
+       return simplify_gen_binary (XOR, mode, XEXP (op0, 0),
+                                   simplify_gen_binary (XOR, mode, op1,
+                                                        XEXP (op0, 1)));
+
       /* Canonicalize (plus (mult (neg B) C) A) to (minus A (mult B C)).  */
       if (!HONOR_SIGN_DEPENDENT_ROUNDING (mode)
          && GET_CODE (op0) == MULT
diff --git a/gcc/testsuite/gcc.target/riscv/slt-1.c b/gcc/testsuite/gcc.target/riscv/slt-1.c
new file mode 100644
index 00000000000..29a64066081
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/slt-1.c
@@ -0,0 +1,59 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv64gc -mabi=lp64d" } */
+/* { dg-skip-if "" { *-*-* } { "-O0" "-Og" } } */
+
+#include <stdint.h>
+
+#define COMPARISON(TYPE, OP, OPN, RESULT_TRUE, RESULT_FALSE) \
+    TYPE test_##OPN(TYPE x, TYPE y) { \
+        return (x OP y) ? RESULT_TRUE : RESULT_FALSE; \
+    }
+
+/* Signed comparisons */
+COMPARISON(int64_t, >, GT1, 2, 3)
+COMPARISON(int64_t, >, GT2, 5, 6)
+
+COMPARISON(int64_t, <, LT1, 2, 3)
+COMPARISON(int64_t, <, LT2, 5, 6)
+
+COMPARISON(int64_t, >=, GE1, 3, 2)
+COMPARISON(int64_t, >=, GE2, 6, 5)
+
+COMPARISON(int64_t, <=, LE1, 3, 2)
+COMPARISON(int64_t, <=, LE2, 6, 5)
+
+/* Unsigned comparisons */
+COMPARISON(uint64_t, >, GTU1, 2, 3)
+COMPARISON(uint64_t, >, GTU2, 5, 6)
+
+COMPARISON(uint64_t, <, LTU1, 2, 3)
+COMPARISON(uint64_t, <, LTU2, 5, 6)
+
+COMPARISON(uint64_t, >=, GEU1, 3, 2)
+COMPARISON(uint64_t, >=, GEU2, 6, 5)
+
+COMPARISON(uint64_t, <=, LEU1, 3, 2)
+COMPARISON(uint64_t, <=, LEU2, 6, 5)
+
+#define COMPARISON_IMM(TYPE, OP, OPN, RESULT_TRUE, RESULT_FALSE) \
+    TYPE testIMM_##OPN(TYPE x) { \
+        return (x OP -3) ? RESULT_TRUE : RESULT_FALSE; \
+    }
+
+/* Signed comparisons with immediate */
+COMPARISON_IMM(int64_t, >, GT1, 3, 2)
+
+COMPARISON_IMM(int64_t, <, LT1, 2, 3)
+
+COMPARISON_IMM(int64_t, >=, GE1, 3, 2)
+
+COMPARISON_IMM(int64_t, <=, LE1, 2, 3)
+
+/* { dg-final { scan-assembler-times "sgt\\t" 4 } } */
+/* { dg-final { scan-assembler-times "sgtu\\t" 4 } } */
+/* { dg-final { scan-assembler-times "slt\\t" 4 } } */
+/* { dg-final { scan-assembler-times "sltu\\t" 4 } } */
+/* { dg-final { scan-assembler-times "slti\\t" 4 } } */
+/* { dg-final { scan-assembler-times "xori\\ta0,a0,1" 8 } } */
+/* { dg-final { scan-assembler-times "xori\\ta0,a0,3" 12 } } */
+
-- 
2.43.0
