Committed, thanks Jeff. Pan
-----Original Message----- From: Gcc-patches <gcc-patches-bounces+pan2.li=intel....@gcc.gnu.org> On Behalf Of Jeff Law via Gcc-patches Sent: Sunday, June 4, 2023 9:55 PM To: juzhe.zh...@rivai.ai; gcc-patches@gcc.gnu.org Cc: kito.ch...@sifive.com; pal...@rivosinc.com; rdapp....@gmail.com Subject: Re: [PATCH] RISC-V: Remove redundant vlmul_ext_* patterns to fix PR110109 On 6/4/23 02:51, juzhe.zh...@rivai.ai wrote: > From: Juzhe-Zhong <juzhe.zh...@rivai.ai> > > PR target/110109 > > This patch is to fix PR110109 issue. This issue happens is because: > > (define_insn_and_split "*vlmul_extx2<mode>" > [(set (match_operand:<VLMULX2> 0 "register_operand" "=vr, ?&vr") > (subreg:<VLMULX2> > (match_operand:VLMULEXT2 1 "register_operand" " 0, vr") 0))] > "TARGET_VECTOR" > "#" > "&& reload_completed" > [(const_int 0)] > { > emit_insn (gen_rtx_SET (gen_lowpart (<MODE>mode, operands[0]), > operands[1])); > DONE; > }) So anytime you find yourself with an explicit subreg in a pattern, there's a very reasonable chance you've made a mistake somewhere else. As a result every time I see an explicit subreg in a pattern I ask the author to describe in a fair amount of detail why the subreg was needed. From a first glance, they definitely look like you're papering over a problem elsewhere. These are just simple moves. For scalar modes this would be clearly wrong, but I'm not sure we have the same restrictions on vector moves. I would also caution against the way you're generating code here. I'd have to sit down with it for a while, but I'm not 100% sure you can just change the location of the subreg like you did (it's going to move from wrapping operand1 to wrapping operand0). The semantics may be subtly different -- and that's one of the other reasons to avoid explicit subregs. It's easy to get the semantics wrong. > > I create that patterns is to optimize the following test: > vfloat32m2_t test_vlmul_ext_v_f32mf2_f32m2(vfloat32mf2_t op1) { > return __riscv_vlmul_ext_v_f32mf2_f32m2(op1); > } > > codegen: > test_vlmul_ext_v_f32mf2_f32m2: > vsetvli a5,zero,e32,m2,ta,ma > vmv.v.i v2,0 > vsetvli a5,zero,e32,mf2,ta,ma > vle32.v v2,0(a1) > vs2r.v v2,0(a0) > ret > > There is a redundant 'vmv.v.i' here, Since GCC doesn't undefine IR (unlike > LLVM, LLVM has undef/poison). > For vlmul_ext_* RVV intrinsic, GCC will initiate all zeros into > register. However, I think it's not a big issue after we support subreg > livness tracking. As I've suggested elsewhere, let's get the code correct and reasonably complete before we worry about this class of problems. I'm not even convinced it's a big issue right now. > > gcc/ChangeLog: > > * config/riscv/riscv-vector-builtins-bases.cc: Change expand > approach. > * config/riscv/vector.md (@vlmul_extx2<mode>): Remove it. > (@vlmul_extx4<mode>): Ditto. > (@vlmul_extx8<mode>): Ditto. > (@vlmul_extx16<mode>): Ditto. > (@vlmul_extx32<mode>): Ditto. > (@vlmul_extx64<mode>): Ditto. > (*vlmul_extx2<mode>): Ditto. > (*vlmul_extx4<mode>): Ditto. > (*vlmul_extx8<mode>): Ditto. > (*vlmul_extx16<mode>): Ditto. > (*vlmul_extx32<mode>): Ditto. > (*vlmul_extx64<mode>): Ditto. > > gcc/testsuite/ChangeLog: > > * gcc.target/riscv/rvv/base/pr110109-1.c: New test. > * gcc.target/riscv/rvv/base/pr110109-2.c: New test. Approved. Please commit. Jeff