Thank you so much. Kito has already helped me fix it.
CSE can now optimize the RVV instruction patterns.



juzhe.zh...@rivai.ai
 
From: Richard Biener
Date: 2023-02-02 20:26
To: juzhe.zh...@rivai.ai
CC: gcc-patches; kito.cheng; richard.sandiford; jeffreyalaw; apinski
Subject: Re: Re: [PATCH] CPROP: Allow cprop optimization when the function has a single block
On Thu, 2 Feb 2023, juzhe.zh...@rivai.ai wrote:
 
> Yes, thanks. You are right: CSE should do the job.
> I now know why CSE failed to optimize: I include the
> VL_REGNUM(66)/VTYPE_REGNUM(67) hard registers
> as dependencies of pred_broadcast:
> (insn 19 18 20 4 (set (reg:VNx1DI 152)
>         (if_then_else:VNx1DI (unspec:VNx1BI [
>                     (const_vector:VNx1BI repeat [
>                             (const_int 1 [0x1])
>                         ])
>                     (const_int 4 [0x4])
>                     (const_int 2 [0x2]) repeated x2
>                     (const_int 0 [0])
>                     (reg:SI 66 vl)
>                     (reg:SI 67 vtype)
>                 ] UNSPEC_VPREDICATE)
>             (vec_duplicate:VNx1DI (reg/v:DI 148 [ x ]))
>             (unspec:VNx1DI [
>                     (const_int 0 [0])
>                 ] UNSPEC_VUNDEF))) "rvv.c":22:23 695 {pred_broadcastvnx1di}
>      (nil))
> As a result, CSE failed to treat reg 152 as a copy.
> 
> VL_REGNUM(66)/VTYPE_REGNUM(67) are global hard registers, and I believe each
> RVV instruction should depend on them,
> because we use the vsetvl instruction (which sets the global
> VL_REGNUM(66)/VTYPE_REGNUM(67) status) to configure the global status for
> each RVV instruction.
> Including the dependency here ensures that the global VL/VTYPE status is
> correct for each RVV instruction. (If we don't include
> such a dependency in the RVV instructions, instruction scheduling may reorder
> the RVV instructions and vsetvl instructions arbitrarily and
> produce an incorrect vsetvl configuration.)
> 
> Originally I set the reg_class of VL_REGNUM(66)/VTYPE_REGNUM(67) as follows:
> riscv_regno_to_class [VL_REGNUM] = VL_REGS;
> riscv_regno_to_class [VTYPE_REGNUM] = VTYPE_REGS;
> This configuration makes CSE fail.
> 
> However, if I change the reg_class to:
> riscv_regno_to_class [VL_REGNUM] = NO_REGS;
> riscv_regno_to_class [VTYPE_REGNUM] = NO_REGS;
> CSE can now do the optimization!
> 
> 1) Would you mind telling me the difference between them?
 
No idea.  I think CSE avoids touching hard register references because
eliding them to copies can increase register pressure.
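
The hypothesis above can be sketched as a toy model in C. This is only an
illustration, not GCC code: every name prefixed with toy_ is hypothetical, and
the class sizes are assumed. It further assumes, which this thread does not
confirm, that the decision depends on whether a hard register's class is
considered "likely spilled", and that a class with exactly one register counts
as such (the documented default of the TARGET_CLASS_LIKELY_SPILLED_P hook).
Under that assumption, VL_REGS/VTYPE_REGS would block recording while NO_REGS
would not.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: none of these names are GCC internals. */
enum toy_class {
  TOY_NO_REGS,
  TOY_VL_REGS,
  TOY_VTYPE_REGS,
  TOY_GR_REGS,
  TOY_N_CLASSES
};

/* Number of registers in each toy class (assumed for illustration). */
static const int toy_class_size[TOY_N_CLASSES] = { 0, 1, 1, 32 };

/* Assumed heuristic: a class with exactly one register is "likely spilled". */
bool toy_class_likely_spilled(enum toy_class c)
{
  return toy_class_size[c] == 1;
}

/* An expression is recorded as a CSE candidate only if none of the hard
   registers it mentions belongs to a likely-spilled class.  */
bool toy_cse_can_record(const enum toy_class *hard_reg_classes, size_t n)
{
  for (size_t i = 0; i < n; i++)
    if (toy_class_likely_spilled(hard_reg_classes[i]))
      return false;
  return true;
}
```

With the two hard registers placed in TOY_VL_REGS and TOY_VTYPE_REGS,
toy_cse_can_record returns false; with both in TOY_NO_REGS it returns true,
which matches the behaviour reported in the quoted message.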
 
> 2) If I set these two global status registers to NO_REGS, will it create
>    issues for the global status configuration of each RVV instruction?
 
No idea either.  Usually this kind of dependence is introduced
by targets at the point the VL setting is introduced to avoid
pessimizing optimizations earlier.  Often, for cases like a VL
register, this is done after register allocation only and indeed
necessary to avoid the second scheduling pass from breaking things.
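
Why such a dependence keeps a scheduler from breaking things can be sketched
with a toy model in C. It is not GCC's scheduler or RTL: the toy_ types and
functions are hypothetical, and only the register numbers 66 (vl), 67 (vtype),
148 and 152 are taken from the dump quoted above. Two instructions may swap
only if neither writes a register the other reads or writes.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: these are NOT GCC's RTL or scheduler structures. */
enum { REG_VL = 66, REG_VTYPE = 67 };  /* numbers from the quoted dump */

#define TOY_MAX_REGS 4

struct toy_insn {
  int defs[TOY_MAX_REGS];  /* registers written */
  size_t n_defs;
  int uses[TOY_MAX_REGS];  /* registers read */
  size_t n_uses;
};

/* True if the two register lists share at least one register. */
static bool toy_overlaps(const int *a, size_t na, const int *b, size_t nb)
{
  for (size_t i = 0; i < na; i++)
    for (size_t j = 0; j < nb; j++)
      if (a[i] == b[j])
        return true;
  return false;
}

/* Reordering is allowed only without RAW, WAR or WAW hazards. */
bool toy_can_reorder(const struct toy_insn *first,
                     const struct toy_insn *second)
{
  return !toy_overlaps(first->defs, first->n_defs,
                       second->uses, second->n_uses)   /* read after write */
      && !toy_overlaps(first->uses, first->n_uses,
                       second->defs, second->n_defs)   /* write after read */
      && !toy_overlaps(first->defs, first->n_defs,
                       second->defs, second->n_defs);  /* write after write */
}
```

If a vsetvl is modelled as writing REG_VL and REG_VTYPE, a broadcast that lists
both as uses cannot be swapped with it, while one that lists only reg 148 can,
which is the reordering the dependence is meant to prevent.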
 
Richard.
 