On 9/17/24 5:25 AM, Jovan Vukic wrote:
The patch optimizes code generated for comparisons of the form x > y ? 2 : 3
(x <= y ? 3 : 2) and x < y ? 2 : 3 (x >= y ? 3 : 2).

For the following C code:

long f1(long x, long y) {
     return (x > y) ? 2 : 3;
}

long f2(long x, long y) {
     return (x < y) ? 2 : 3;
}


Before the patch, the generated assembly is:

f1(long, long):
         sgt     a0, a0, a1
         xori    a0, a0, 1
         addi    a0, a0, 2
         ret

f2(long, long):
         slt     a0, a0, a1
         xori    a0, a0, 1
         addi    a0, a0, 2
         ret


After the patch, the generated assembly is:

f1(long, long):
         slt     a0,a1,a0
         xori    a0,a0,3
         ret

f2(long, long):
         slt     a0,a0,a1
         xori    a0,a0,3
         ret


If we only consider the first example, during the combine pass, it can match
the "le:" pattern (x <= y), but that can only be replaced by the inverse
operation x > y (y < x), which requires replacing the following addi
instruction with an xori instruction as well, as proposed by the patch.

Tested on both RV32 and RV64 with no regressions.

2024-09-17  Jovan Vukic  <jovan.vu...@rt-rk.com>

         PR target/108038

gcc/ChangeLog:

         * config/riscv/iterators.md (paired_lt): New code attributes.
         * config/riscv/riscv.md (*sle<u>_<X:mode><GPR:mode>_xor): New pattern.
         (*sge<u>_<X:mode><GPR:mode>_xor): New pattern.

gcc/testsuite/ChangeLog:

         * gcc.target/riscv/slt-1.c: New test.
So you're essentially taking advantage of the fact that the bits set by the sCC and the constant in the addi are disjoint. That's a pretty generic concept. So there's a real possibility we could do this in generic code so that everyone could benefit.

The first point in the various pipelines would probably be phi-opt which tries to collapse certain if-then constructs into conditional expressions (tree-ssa-phiopt.cc).

It'll be presented with something like this:

;;   basic block 2, loop depth 0
;;    pred:       ENTRY
  if (x_2(D) > y_3(D))
    goto <bb 3>; [50.00%]
  else
    goto <bb 4>; [50.00%]
;;    succ:       3
;;                4

;;   basic block 3, loop depth 0
;;    pred:       2
;;    succ:       4

;;   basic block 4, loop depth 0
;;    pred:       2
;;                3
  # _1 = PHI <3(2), 2(3)>

Which seems well suited for value_replacement in tree-ssa-phiopt.cc except that value_replacement seems to want an equality comparison and SSA_NAMEs (rather than constants) as PHI args. But still, it's pretty close. Somewhere in tree-ssa-phiopt.cc seems like the best place to handle this.

An alternative would be in in simplify_binary_operation in the RTL pipeline. At some point it should be passed a useful PLUS as the outer operator and (xor (reg) (const_int 1)) as the first argument and the constant as the second argument.

*If* we ultimately decided to do this in the risc-v backend, the first thing I would suggest would be to try and use a define_split rather than a define_insn_and_split. The latter tends to throw off costing models for splitting, thus the former is slightly preferred.

And just to be 100% clear, I think this is a good patch, but that there's a better way to accomplish what you're trying to do.

jeff

Reply via email to