On 2/2/23 05:26, Richard Biener wrote:
On Thu, 2 Feb 2023, juzhe.zh...@rivai.ai wrote:

Yeah, thanks. You are right: CSE should do the job.
I now know why CSE failed to optimize: I include the
VL_REGNUM(66)/VTYPE_RENUM(67) hard regs
as dependencies of pred_broadcast:
(insn 19 18 20 4 (set (reg:VNx1DI 152)
         (if_then_else:VNx1DI (unspec:VNx1BI [
                     (const_vector:VNx1BI repeat [
                             (const_int 1 [0x1])
                         ])
                     (const_int 4 [0x4])
                     (const_int 2 [0x2]) repeated x2
                     (const_int 0 [0])
                     (reg:SI 66 vl)
                     (reg:SI 67 vtype)
                 ] UNSPEC_VPREDICATE)
             (vec_duplicate:VNx1DI (reg/v:DI 148 [ x ]))
             (unspec:VNx1DI [
                     (const_int 0 [0])
                 ] UNSPEC_VUNDEF))) "rvv.c":22:23 695 {pred_broadcastvnx1di}
      (nil))
As a result, CSE fails to turn reg 152 into a copy.

VL_REGNUM(66)/VTYPE_RENUM(67) are global hard regs, and I should make each
RVV instruction depend on them,
since we use the vsetvl instruction (which sets the global
VL_REGNUM(66)/VTYPE_RENUM(67) status) to configure the global status for
each RVV instruction.
Including the dependency here makes sure the global VL/VTYPE status is
correct for each RVV instruction. (If we don't include
such a dependency in the RVV instructions, instruction scheduling may reorder
the RVV instructions and the vsetvl instructions arbitrarily and
produce an incorrect vsetvl configuration.)
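
The scheduling concern above can be illustrated with a toy model. This is only a sketch and not GCC code: the struct, the bit masks and the toy_can_swap helper are all hypothetical names invented for illustration. It assumes each instruction records the hard registers it sets and uses, and that two adjacent instructions may be swapped only when neither writes something the other reads or writes.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model, not GCC code: hard registers are tracked as bits in a mask.
   VL_BIT and VTYPE_BIT are hypothetical stand-ins for the vl/vtype regs.  */
enum { VL_BIT = 1u << 0, VTYPE_BIT = 1u << 1 };

struct toy_insn {
  const char *name;
  unsigned sets;   /* registers written by the instruction */
  unsigned uses;   /* registers read by the instruction */
};

/* Two adjacent instructions may be swapped only if neither one writes a
   register that the other one reads or writes.  */
static bool
toy_can_swap (const struct toy_insn *a, const struct toy_insn *b)
{
  if (a->sets & (b->uses | b->sets))
    return false;
  if (b->sets & a->uses)
    return false;
  return true;
}

/* Checks the two cases discussed in the mail.  */
static void
toy_self_check (void)
{
  struct toy_insn vsetvl = { "vsetvl", VL_BIT | VTYPE_BIT, 0 };
  struct toy_insn with_dep = { "pred_broadcast", 0, VL_BIT | VTYPE_BIT };
  struct toy_insn without_dep = { "pred_broadcast", 0, 0 };

  /* With the vl/vtype use recorded, the pair must stay in order.  */
  assert (!toy_can_swap (&vsetvl, &with_dep));
  /* Without it, nothing in this model stops the reordering.  */
  assert (toy_can_swap (&vsetvl, &without_dep));
}
```

In this model the only thing keeping the RVV instruction behind its vsetvl is the recorded use of vl/vtype, which is the reason given above for putting the two hard regs into the pattern.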

Originally I set the reg_class of VL_REGNUM(66)/VTYPE_RENUM(67) like this:
riscv_regno_to_class [VL_REGNUM] = VL_REGS;
riscv_regno_to_class [VTYPE_RENUM] = VTYPE_REGS;
With this configuration CSE fails.

However, if I change the reg_class to:
riscv_regno_to_class [VL_REGNUM] = NO_REGS;
riscv_regno_to_class [VTYPE_RENUM] = NO_REGS;
CSE can now do the optimization!

1) Would you mind telling me the difference between them?

No idea.  I think CSE avoids touching hard register references because
eliding them to copies can increase register pressure.

IIRC the costing is set up differently and for a given partition a
pseudo will be preferred over a hard reg. This is in addition to other places that test the small register class hooks.

2) If I set these two global status registers to NO_REGS, will it create
    issues for the global status configuration of each RVV instruction?

No idea either.  Usually these kinds of dependences are introduced
by targets at the point the VL setting is introduced to avoid
pessimizing optimizations earlier.  Often, for cases like a VL
register, this is done after register allocation only and indeed
necessary to avoid the second scheduling pass from breaking things.

Yea. I'm wondering about when the right place to introduce these dependencies might be. I'm still a few months out from worrying about RVV, but it's not too far away.
jeff
