Committed, thanks Jeff.

Pan
-----Original Message-----
From: Gcc-patches <gcc-patches-bounces+pan2.li=intel....@gcc.gnu.org> On Behalf 
Of Jeff Law via Gcc-patches
Sent: Sunday, June 4, 2023 9:55 PM
To: juzhe.zh...@rivai.ai; gcc-patches@gcc.gnu.org
Cc: kito.ch...@sifive.com; pal...@rivosinc.com; rdapp....@gmail.com
Subject: Re: [PATCH] RISC-V: Remove redundant vlmul_ext_* patterns to fix 
PR110109



On 6/4/23 02:51, juzhe.zh...@rivai.ai wrote:
> From: Juzhe-Zhong <juzhe.zh...@rivai.ai>
> 
>          PR target/110109
> 
> This patch is to fix PR110109 issue. This issue happens is because:
> 
> (define_insn_and_split "*vlmul_extx2<mode>"
>    [(set (match_operand:<VLMULX2> 0 "register_operand"  "=vr, ?&vr")
>         (subreg:<VLMULX2>
>           (match_operand:VLMULEXT2 1 "register_operand" " 0,   vr") 0))]
>    "TARGET_VECTOR"
>    "#"
>    "&& reload_completed"
>    [(const_int 0)]
> {
>    emit_insn (gen_rtx_SET (gen_lowpart (<MODE>mode, operands[0]), 
> operands[1]));
>    DONE;
> })
So anytime you find yourself with an explicit subreg in a pattern, there's a 
very reasonable chance you've made a mistake somewhere else.

As a result every time I see an explicit subreg in a pattern I ask the author 
to describe in a fair amount of detail why the subreg was needed.

 From a first glance, they definitely look like you're papering over a 
problem elsewhere.    These are just simple moves.  For scalar modes 
this would be clearly wrong, but I'm not sure we have the same restrictions on 
vector moves.

I would also caution against the way you're generating code here.  I'd have to 
sit down with it for a while, but I'm not 100% sure you can just change the 
location of the subreg like you did (it's going to move from wrapping operand1 
to wrapping operand0).  The semantics may be subtly different -- and that's one 
of the other reasons to avoid explicit subregs.  It's easy to get the semantics 
wrong.


> 
> I create that patterns is to optimize the following test:
> vfloat32m2_t test_vlmul_ext_v_f32mf2_f32m2(vfloat32mf2_t op1) {
>    return __riscv_vlmul_ext_v_f32mf2_f32m2(op1);
> }
> 
> codegen:
> test_vlmul_ext_v_f32mf2_f32m2:
>          vsetvli a5,zero,e32,m2,ta,ma
>          vmv.v.i v2,0
>          vsetvli a5,zero,e32,mf2,ta,ma
>          vle32.v v2,0(a1)
>          vs2r.v  v2,0(a0)
>          ret
> 
> There is a redundant 'vmv.v.i' here, Since GCC doesn't undefine IR (unlike 
> LLVM, LLVM has undef/poison).
> For vlmul_ext_* RVV intrinsic, GCC will initiate all zeros into 
> register. However, I think it's not a big issue after we support subreg 
> livness tracking.
As I've suggested elsewhere, let's get the code correct and reasonably complete 
before we worry about this class of problems.  I'm not even convinced it's a 
big issue right now.



> 
> gcc/ChangeLog:
> 
>          * config/riscv/riscv-vector-builtins-bases.cc: Change expand 
> approach.
>          * config/riscv/vector.md (@vlmul_extx2<mode>): Remove it.
>          (@vlmul_extx4<mode>): Ditto.
>          (@vlmul_extx8<mode>): Ditto.
>          (@vlmul_extx16<mode>): Ditto.
>          (@vlmul_extx32<mode>): Ditto.
>          (@vlmul_extx64<mode>): Ditto.
>          (*vlmul_extx2<mode>): Ditto.
>          (*vlmul_extx4<mode>): Ditto.
>          (*vlmul_extx8<mode>): Ditto.
>          (*vlmul_extx16<mode>): Ditto.
>          (*vlmul_extx32<mode>): Ditto.
>          (*vlmul_extx64<mode>): Ditto.
> 
> gcc/testsuite/ChangeLog:
> 
>          * gcc.target/riscv/rvv/base/pr110109-1.c: New test.
>          * gcc.target/riscv/rvv/base/pr110109-2.c: New test.
Approved.  Please commit.

Jeff

Reply via email to