Thanks, Jeff, for the comments.

> OK.  Presumably the code you're getting here is more efficient than
> whatever standard expansion would provide?  If so, should we be looking
> at moving some of this stuff into generic expanders?  I don't really see
> anything all that target specific here.

Mostly because we can eliminate the branch for the scalar .SAT_ADD, given
that scalar has no single SAT_ADD-like insn such as RVV's vsadd.vv/vx/vi.

Pan

-----Original Message-----
From: Jeff Law <jeffreya...@gmail.com>
Sent: Sunday, September 1, 2024 11:35 PM
To: Li, Pan2 <pan2...@intel.com>; gcc-patches@gcc.gnu.org
Cc: juzhe.zh...@rivai.ai; kito.ch...@gmail.com; rdapp....@gmail.com
Subject: Re: [PATCH v1] RISC-V: Support form 1 of integer scalar .SAT_ADD

On 8/29/24 12:25 AM, pan2...@intel.com wrote:
> From: Pan Li <pan2...@intel.com>
>
> This patch would like to support the scalar signed ssadd pattern
> for the RISC-V backend.  Aka
>
> Form 1:
>   #define DEF_SAT_S_ADD_FMT_1(T, UT, MIN, MAX) \
>   T __attribute__((noinline))                  \
>   sat_s_add_##T##_fmt_1 (T x, T y)             \
>   {                                            \
>     T sum = (UT)x + (UT)y;                     \
>     return (x ^ y) < 0                         \
>       ? sum                                    \
>       : (sum ^ x) >= 0                         \
>         ? sum                                  \
>         : x < 0 ? MIN : MAX;                   \
>   }
>
> DEF_SAT_S_ADD_FMT_1(int64_t, uint64_t, INT64_MIN, INT64_MAX)
>
> Before this patch:
> sat_s_add_int64_t_fmt_1:
>   mv   a5,a0
>   add  a0,a0,a1
>   xor  a1,a5,a1
>   not  a1,a1
>   xor  a4,a5,a0
>   and  a1,a1,a4
>   blt  a1,zero,.L5
>   ret
> .L5:
>   srai a5,a5,63
>   li   a0,-1
>   srli a0,a0,1
>   xor  a0,a5,a0
>   ret
>
> After this patch:
> sat_s_add_int64_t_fmt_1:
>   add  a2,a0,a1
>   xor  a1,a0,a1
>   xor  a5,a0,a2
>   srli a5,a5,63
>   srli a1,a1,63
>   xori a1,a1,1
>   and  a5,a5,a1
>   srai a4,a0,63
>   li   a3,-1
>   srli a3,a3,1
>   xor  a3,a3,a4
>   neg  a4,a5
>   and  a3,a3,a4
>   addi a5,a5,-1
>   and  a0,a2,a5
>   or   a0,a0,a3
>   ret
>
> The below test suites are passed for this patch:
> 1. The rv64gcv fully regression test.
>
> gcc/ChangeLog:
>
>   * config/riscv/riscv-protos.h (riscv_expand_ssadd): Add new func
>   decl for expanding ssadd.
>   * config/riscv/riscv.cc (riscv_gen_sign_max_cst): Add new func
>   impl to gen the max int rtx.
>   (riscv_expand_ssadd): Add new func impl to expand the ssadd.
>   * config/riscv/riscv.md (ssadd<mode>3): Add new pattern for
>   signed integer .SAT_ADD.
>
> gcc/testsuite/ChangeLog:
>
>   * gcc.target/riscv/sat_arith.h: Add test helper macros.
>   * gcc.target/riscv/sat_arith_data.h: Add test data.
>   * gcc.target/riscv/sat_s_add-1.c: New test.
>   * gcc.target/riscv/sat_s_add-2.c: New test.
>   * gcc.target/riscv/sat_s_add-3.c: New test.
>   * gcc.target/riscv/sat_s_add-4.c: New test.
>   * gcc.target/riscv/sat_s_add-run-1.c: New test.
>   * gcc.target/riscv/sat_s_add-run-2.c: New test.
>   * gcc.target/riscv/sat_s_add-run-3.c: New test.
>   * gcc.target/riscv/sat_s_add-run-4.c: New test.
>   * gcc.target/riscv/scalar_sat_binary_run_xxx.h: New test.

OK.  Presumably the code you're getting here is more efficient than
whatever standard expansion would provide?  If so, should we be looking
at moving some of this stuff into generic expanders?  I don't really see
anything all that target specific here.

jeff
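
For readers following the thread, the branchless "After this patch" sequence in the patch description can be read roughly as the C below. This is only a sketch under assumptions: the function name sat_s_add_i64_branchless is made up for illustration, it is not the code in riscv_expand_ssadd, and the final unsigned-to-signed cast relies on the modulo conversion that GCC documents.

```c
#include <stdint.h>

/* Hypothetical helper, not part of the patch: a C reading of the
   branchless "After this patch" assembly for int64_t.  */
static int64_t
sat_s_add_i64_branchless (int64_t x, int64_t y)
{
  uint64_t ux = (uint64_t) x;
  uint64_t uy = (uint64_t) y;

  uint64_t sum = ux + uy;                      /* add: wrapping sum */
  uint64_t same_sign = ((ux ^ uy) >> 63) ^ 1;  /* xor, srli, xori: 1 if signs match */
  uint64_t sign_flip = (ux ^ sum) >> 63;       /* xor, srli: 1 if sum's sign differs from x's */
  uint64_t overflow = sign_flip & same_sign;   /* and: 0 or 1 */

  uint64_t sign_mask = 0 - (ux >> 63);         /* like srai x,63: 0 or all ones */
  uint64_t bound = (uint64_t) INT64_MAX ^ sign_mask; /* MAX if x >= 0, MIN if x < 0 */

  uint64_t take_bound = 0 - overflow;          /* neg: all ones on overflow */
  uint64_t take_sum = overflow - 1;            /* addi -1: all ones when no overflow */

  /* Select with masks instead of a branch.  The cast assumes modulo
     conversion to int64_t, as GCC implements it.  */
  return (int64_t) ((sum & take_sum) | (bound & take_bound));
}
```

The overflow test matches Form 1: the result saturates only when x and y have the same sign and the wrapped sum's sign differs from x's. The two masks then pick either the sum or the bound, which is what removes the blt from the "Before this patch" sequence.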