Thanks Jeff.

> But I would expect that may be beneficial on other targets as well.
I think x86 has a similar insn for saturation, for example paddsw, described at 
the link below.
https://www.felixcloutier.com/x86/paddsb:paddsw

And I bet the x86 backend has implemented some of them already, like usadd and 
ussub.
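
As a rough illustration of what a signed saturating add such as paddsw 
computes per 16-bit lane, here is a minimal C sketch. The helper names 
sat_add_s16 and check_sat_add_s16 are made up for this example and are not 
taken from any backend; the sketch only models the arithmetic, not the 
instruction encoding or the vector width.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper modelling one 16-bit lane of a signed saturating add:
   widen, add, then clamp the result to the int16_t range.  */
static int16_t
sat_add_s16 (int16_t x, int16_t y)
{
  int32_t sum = (int32_t) x + (int32_t) y;

  if (sum > INT16_MAX)
    return INT16_MAX;
  if (sum < INT16_MIN)
    return INT16_MIN;
  return (int16_t) sum;
}

/* A few spot checks of the clamping behaviour.  */
void
check_sat_add_s16 (void)
{
  assert (sat_add_s16 (30000, 10000) == INT16_MAX);
  assert (sat_add_s16 (-30000, -10000) == INT16_MIN);
  assert (sat_add_s16 (100, -50) == 50);
}
```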

> The other question that I think Robin initially raised to me privately 
> is whether or not the sequences we're generating are well suited for 
> zicond or not.  

Got it, a cmov-like insn is well suited to such case(s). We can consider the 
best practice for leveraging the zicond ext in further improvements.
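
To make the idea concrete, below is a minimal C sketch of how the final 
select in a signed saturating add could map onto zicond. The two small 
helpers model czero.eqz and czero.nez as the Zicond spec defines them; all 
function names here are hypothetical, and this is not what the patch or the 
compiler currently emits. It assumes GCC's behaviour for arithmetic right 
shift of negative values and for unsigned-to-signed conversion.

```c
#include <assert.h>
#include <stdint.h>

/* C models of the two Zicond instructions (helper names are hypothetical).
   czero.eqz rd, rs1, rs2: rd = (rs2 == 0) ? 0 : rs1
   czero.nez rd, rs1, rs2: rd = (rs2 != 0) ? 0 : rs1  */
static int64_t
czero_eqz (int64_t rs1, int64_t rs2)
{
  return rs2 == 0 ? 0 : rs1;
}

static int64_t
czero_nez (int64_t rs1, int64_t rs2)
{
  return rs2 != 0 ? 0 : rs1;
}

/* cond ? a : b expressed as czero.eqz + czero.nez + or.  */
static int64_t
select_zicond (int64_t cond, int64_t a, int64_t b)
{
  return czero_eqz (a, cond) | czero_nez (b, cond);
}

/* Signed saturating add whose final choice goes through the select above.
   Assumes arithmetic >> on negative values and wrapping conversion, as
   GCC provides.  */
static int64_t
sat_s_add_zicond (int64_t x, int64_t y)
{
  int64_t sum = (int64_t) ((uint64_t) x + (uint64_t) y);
  /* Overflow iff x and y share a sign and the sum's sign differs from x.  */
  int64_t overflow = (int64_t) ((uint64_t) (~(x ^ y) & (x ^ sum)) >> 63);
  /* INT64_MAX when x >= 0, INT64_MIN when x < 0.  */
  int64_t limit = (x >> 63) ^ INT64_MAX;

  return select_zicond (overflow, limit, sum);
}

/* Spot checks for both saturation directions and the plain case.  */
void
check_sat_s_add_zicond (void)
{
  assert (sat_s_add_zicond (INT64_MAX, 1) == INT64_MAX);
  assert (sat_s_add_zicond (INT64_MIN, -1) == INT64_MIN);
  assert (sat_s_add_zicond (1, 2) == 3);
}
```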

Pan

-----Original Message-----
From: Jeff Law <jeffreya...@gmail.com> 
Sent: Monday, September 2, 2024 11:32 AM
To: Li, Pan2 <pan2...@intel.com>; gcc-patches@gcc.gnu.org
Cc: juzhe.zh...@rivai.ai; kito.ch...@gmail.com; rdapp....@gmail.com
Subject: Re: [PATCH v1] RISC-V: Support form 1 of integer scalar .SAT_ADD



On 9/1/24 8:50 PM, Li, Pan2 wrote:
> Thanks Jeff for comments.
> 
>> OK.  Presumably the code you're getting here is more efficient than
>> whatever standard expansion would provide?  If so, should we be looking
>> at moving some of this stuff into generic expanders?  I don't really see
>> anything all that target specific here.
> 
> Mostly because we can eliminate the branch for .SAT_ADD in scalar, given that
> we don't have a single SAT_ADD-like insn such as RVV's vsadd.vv/vx/vi.
But I would expect that may be beneficial on other targets as well. 
It's not conceptually a lot different than what we do for basic arithmetic 
with overflow, which has a generic expansion that can be overridden by 
target specific expanders.  See expand_addsub_overflow.

Again, I think this is OK, but I'm thinking we probably want something 
more generic in the longer term.
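
For readers less familiar with the overflow arithmetic mentioned above, the 
user-visible side of it is the family of overflow-checking builtins. A 
minimal sketch of a signed saturating add written on top of 
__builtin_add_overflow follows; the function names are hypothetical, and 
this only illustrates the source-level semantics, not how a generic .SAT_ADD 
expander would actually be structured.

```c
#include <assert.h>
#include <stdint.h>

/* Sketch only: signed saturating add on top of the GCC/Clang
   __builtin_add_overflow builtin.  On overflow both operands have the
   same sign, so the sign of x picks the saturation bound.  */
static int64_t
sat_s_add_builtin (int64_t x, int64_t y)
{
  int64_t sum;

  if (__builtin_add_overflow (x, y, &sum))
    return x < 0 ? INT64_MIN : INT64_MAX;
  return sum;
}

/* Spot checks for both saturation directions and the plain case.  */
void
check_sat_s_add_builtin (void)
{
  assert (sat_s_add_builtin (INT64_MAX, 1) == INT64_MAX);
  assert (sat_s_add_builtin (INT64_MIN, -1) == INT64_MIN);
  assert (sat_s_add_builtin (40, 2) == 42);
}
```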

The other question that I think Robin initially raised to me privately 
is whether or not the sequences we're generating are well suited for 
zicond or not.  If not, we might want to consider adjustments to either 
generate zicond if-then-else constructs during initial code generation 
or bias the initial code generator towards sequences that ifcvt & combine 
can turn into zicond.  But again not strictly necessary for this patch 
to go forward, more a potential avenue for further improvements.
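
For reference, the branchless "After this patch" sequence quoted further 
down reads roughly like the C below, with the matching instructions noted in 
comments. This is a hand-written approximation with hypothetical function 
and variable names, not output from the compiler; it assumes GCC's 
arithmetic right shift on negative values and wrapping unsigned-to-signed 
conversion. The final select is done with two complementary masks, which is 
the part a zicond if-then-else could replace.

```c
#include <assert.h>
#include <stdint.h>

/* Hand-written C approximation of the mask-based sequence.  */
static int64_t
sat_s_add_masked (int64_t x, int64_t y)
{
  uint64_t ux = (uint64_t) x;
  uint64_t uy = (uint64_t) y;
  uint64_t sum = ux + uy;                         /* add  */
  /* 1 iff x and y share a sign and the sum's sign differs from x.  */
  uint64_t overflow = ((ux ^ sum) >> 63)          /* xor, srli  */
                      & (((ux ^ uy) >> 63) ^ 1);  /* xor, srli, xori, and  */
  /* INT64_MAX for x >= 0, INT64_MIN bit pattern for x < 0.  */
  uint64_t limit = (uint64_t) INT64_MAX
                   ^ (uint64_t) (x >> 63);        /* li, srli, srai, xor  */
  uint64_t take_limit = -overflow;                /* neg: all ones on overflow  */
  uint64_t take_sum = overflow - 1;               /* addi: all ones otherwise  */

  return (int64_t) ((sum & take_sum) | (limit & take_limit));  /* and, and, or  */
}

/* Spot checks for both saturation directions and the plain case.  */
void
check_sat_s_add_masked (void)
{
  assert (sat_s_add_masked (INT64_MAX, 1) == INT64_MAX);
  assert (sat_s_add_masked (INT64_MIN, -1) == INT64_MIN);
  assert (sat_s_add_masked (1, 2) == 3);
}
```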


> 
> Pan
> 
> -----Original Message-----
> From: Jeff Law <jeffreya...@gmail.com>
> Sent: Sunday, September 1, 2024 11:35 PM
> To: Li, Pan2 <pan2...@intel.com>; gcc-patches@gcc.gnu.org
> Cc: juzhe.zh...@rivai.ai; kito.ch...@gmail.com; rdapp....@gmail.com
> Subject: Re: [PATCH v1] RISC-V: Support form 1 of integer scalar .SAT_ADD
> 
> 
> 
> On 8/29/24 12:25 AM, pan2...@intel.com wrote:
>> From: Pan Li <pan2...@intel.com>
>>
>> This patch adds support for the scalar signed ssadd pattern
>> in the RISC-V backend, i.e.
>>
>> Form 1:
>>     #define DEF_SAT_S_ADD_FMT_1(T, UT, MIN, MAX) \
>>     T __attribute__((noinline))                  \
>>     sat_s_add_##T##_fmt_1 (T x, T y)             \
>>     {                                            \
>>       T sum = (UT)x + (UT)y;                     \
>>       return (x ^ y) < 0                         \
>>         ? sum                                    \
>>         : (sum ^ x) >= 0                         \
>>           ? sum                                  \
>>           : x < 0 ? MIN : MAX;                   \
>>     }
>>
>> DEF_SAT_S_ADD_FMT_1(int64_t, uint64_t, INT64_MIN, INT64_MAX)
>>
>> Before this patch:
>>     sat_s_add_int64_t_fmt_1:
>>         mv   a5,a0
>>         add  a0,a0,a1
>>         xor  a1,a5,a1
>>         not  a1,a1
>>         xor  a4,a5,a0
>>         and  a1,a1,a4
>>         blt  a1,zero,.L5
>>         ret
>>     .L5:
>>         srai a5,a5,63
>>         li   a0,-1
>>         srli a0,a0,1
>>         xor  a0,a5,a0
>>         ret
>>
>> After this patch:
>>     sat_s_add_int64_t_fmt_1:
>>         add  a2,a0,a1
>>         xor  a1,a0,a1
>>         xor  a5,a0,a2
>>         srli a5,a5,63
>>         srli a1,a1,63
>>         xori a1,a1,1
>>         and  a5,a5,a1
>>         srai a4,a0,63
>>         li   a3,-1
>>         srli a3,a3,1
>>         xor  a3,a3,a4
>>         neg  a4,a5
>>         and  a3,a3,a4
>>         addi a5,a5,-1
>>         and  a0,a2,a5
>>         or   a0,a0,a3
>>         ret
>>
>> The test suites below pass with this patch:
>> 1. The rv64gcv full regression test.
>>
>> gcc/ChangeLog:
>>
>>      * config/riscv/riscv-protos.h (riscv_expand_ssadd): Add new func
>>      decl for expanding ssadd.
>>      * config/riscv/riscv.cc (riscv_gen_sign_max_cst): Add new func
>>      impl to gen the max int rtx.
>>      (riscv_expand_ssadd): Add new func impl to expand the ssadd.
>>      * config/riscv/riscv.md (ssadd<mode>3): Add new pattern for
>>      signed integer .SAT_ADD.
>>
>> gcc/testsuite/ChangeLog:
>>
>>      * gcc.target/riscv/sat_arith.h: Add test helper macros.
>>      * gcc.target/riscv/sat_arith_data.h: Add test data.
>>      * gcc.target/riscv/sat_s_add-1.c: New test.
>>      * gcc.target/riscv/sat_s_add-2.c: New test.
>>      * gcc.target/riscv/sat_s_add-3.c: New test.
>>      * gcc.target/riscv/sat_s_add-4.c: New test.
>>      * gcc.target/riscv/sat_s_add-run-1.c: New test.
>>      * gcc.target/riscv/sat_s_add-run-2.c: New test.
>>      * gcc.target/riscv/sat_s_add-run-3.c: New test.
>>      * gcc.target/riscv/sat_s_add-run-4.c: New test.
>>      * gcc.target/riscv/scalar_sat_binary_run_xxx.h: New test.
> OK.  Presumably the code you're getting here is more efficient than
> whatever standard expansion would provide?  If so, should we be looking
> at moving some of this stuff into generic expanders?  I don't really see
> anything all that target specific here.
> 
> jeff
> 
