https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114729
Vineet Gupta <vineetg at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Last reconfirmed|2024-04-15 00:00:00         |2024-4-16

--- Comment #9 from Vineet Gupta <vineetg at gcc dot gnu.org> ---
So I started with the reg being spilled (a1)

.L2:
    beq   a1,zero,.L5     # if j[1] == 0
    li    a2,1
    ble   a6,s11,.L2      # if j[0] < 1
    sd    a1,8(sp)        # spill (save)
.L3:                      # inner loop start
    ...
    blt   a2,a6,.L3       # inner loop end
    ld    a1,8(sp)        # spill (restore)
    j     .L2

Next was zooming into the inner loop, where a1 is used/clobbered with sched1
and not without it, using my rudimentary define, use, dead annotation.

------------------------------------------------------------------------------
        -fschedule-insns (NOK)         |       -fno-schedule-insns (OK)
------------------------------------------------------------------------------
1-def  ld    a5,%lo(u)(s0)   #u, u     | 1-def  ld    a5,%lo(u)(t6)   # u, u
2-def  srliw a0,a5,16                  | 2-def  srliw s10,a5,16
3-def  srli  a1,a5,32                  | 1-use  sh    a5,%lo(_Z1sv)(a4)
1-use  sh    a5,%lo(_Z1sv)(a3)         | 2-dead sh    s10,%lo(_Z1sv+2)(a4)
       ---insn1---                     | 3-def  srli  s10,a5,32
1-use  srli  a5,a5,48                  | 1-use  srli  a5,a5,48
       ---insn2---                     | 1-dead sh    a5,%lo(_Z1sv+6)(a4)
2-dead sh    a0,%lo(_Z1sv+2)(a3)       |        ---insn1---
3-dead sh    a1,%lo(_Z1sv+4)(a3)       |        ---insn2---
1-dead sh    a5,%lo(_Z1sv+6)(a3)       | 3-dead sh    s10,%lo(_Z1sv+4)(a4)

The problem seems to be the longer live range of 2-def (on the left side). If
it were used/dead right after its def, 3-def wouldn't need a new register:
on the right side, 3-def reuses s10 once 2-def is dead.

With that insight, I can now start looking into the sched1 dumps of the
corresponding BB.

;;  10--> b 0: i 35 r170#0=[r242+low(`u')]                   :alu:@GR_REGS+1(1)@FP_REGS+0(0)
;;  11--> b 0: i 79 r209=[r229+low(`f')]                     :alu:GR_REGS+0(0)FP_REGS+1(1)
;;  12--> b 0: i 76 r141=fix(r206)                           :alu:@GR_REGS+1(1)@FP_REGS+0(-1)
;;  13--> b 0: i 46 r180=zxt(r170,0x10,0x10)                 :alu:@GR_REGS+1(1)@FP_REGS+0(0)
;;  14--> b 0: i 55 r188=r170 0>>0x20                        :alu:GR_REGS+1(1)FP_REGS+0(0)
;;  15--> b 0: i 81 r210=r141<<0x3                           :alu:GR_REGS+1(0)FP_REGS+0(0)
;;  16--> b 0: i 82 r211=r143+r210                           :alu:GR_REGS+1(0)FP_REGS+0(0)
;;  17--> b 0: i 44 [r230+low(`_Z1sv')]=r170#0               :alu:@GR_REGS+0(0)@FP_REGS+0(0)
;;  18--> b 0: i 65 r197=r170 0>>0x30                        :alu:GR_REGS+1(0)FP_REGS+0(0)
;;  19--> b 0: i 54 [r230+low(const(`_Z1sv'+0x2))]=r180#0    :alu:@GR_REGS+0(-1)@FP_REGS+0(0)
;;  20--> b 0: i 64 [r230+low(const(`_Z1sv'+0x4))]=r188#0    :alu:GR_REGS+0(-1)FP_REGS+0(0)
;;  21--> b 0: i 73 [r230+low(const(`_Z1sv'+0x6))]=r197#0    :alu:GR_REGS+0(-1)FP_REGS+0(0)
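
To tie the live-range observation to this dump, here is a rough Python sketch (not part of GCC, just a throwaway helper) that reads the lines quoted above and reports, for each pseudo defined in the excerpt, the cycle of its def and the cycle of its last use. The regexes and the names `LINE_RE`, `REG_RE` and `live_spans` are my own assumptions about the dump layout; it only handles the simple "dest=src" patterns seen here, and it treats the scheduler cycle as the time axis.

```python
import re

# sched1 dump lines copied from the comment above.
DUMP = r"""
;; 10--> b 0: i 35 r170#0=[r242+low(`u')] :alu:@GR_REGS+1(1)@FP_REGS+0(0)
;; 11--> b 0: i 79 r209=[r229+low(`f')] :alu:GR_REGS+0(0)FP_REGS+1(1)
;; 12--> b 0: i 76 r141=fix(r206) :alu:@GR_REGS+1(1)@FP_REGS+0(-1)
;; 13--> b 0: i 46 r180=zxt(r170,0x10,0x10) :alu:@GR_REGS+1(1)@FP_REGS+0(0)
;; 14--> b 0: i 55 r188=r170 0>>0x20 :alu:GR_REGS+1(1)FP_REGS+0(0)
;; 15--> b 0: i 81 r210=r141<<0x3 :alu:GR_REGS+1(0)FP_REGS+0(0)
;; 16--> b 0: i 82 r211=r143+r210 :alu:GR_REGS+1(0)FP_REGS+0(0)
;; 17--> b 0: i 44 [r230+low(`_Z1sv')]=r170#0 :alu:@GR_REGS+0(0)@FP_REGS+0(0)
;; 18--> b 0: i 65 r197=r170 0>>0x30 :alu:GR_REGS+1(0)FP_REGS+0(0)
;; 19--> b 0: i 54 [r230+low(const(`_Z1sv'+0x2))]=r180#0 :alu:@GR_REGS+0(-1)@FP_REGS+0(0)
;; 20--> b 0: i 64 [r230+low(const(`_Z1sv'+0x4))]=r188#0 :alu:GR_REGS+0(-1)FP_REGS+0(0)
;; 21--> b 0: i 73 [r230+low(const(`_Z1sv'+0x6))]=r197#0 :alu:GR_REGS+0(-1)FP_REGS+0(0)
"""

# Assumed layout: ";; <cycle>--> b <bb>: i <uid> <pattern> :<unit>:<pressure>"
LINE_RE = re.compile(r";;\s*(\d+)-->\s*b\s*\d+:\s*i\s*(\d+)\s+(\S.*?)\s+:\w+:")
# Pseudo registers appear as r<number> in the slim RTL pattern.
REG_RE = re.compile(r"\br\d+")


def live_spans(dump):
    """Map each pseudo defined in the dump to (def_cycle, last_use_cycle)."""
    def_cycle = {}
    last_use = {}
    for line in dump.splitlines():
        m = LINE_RE.match(line.strip())
        if not m:
            continue
        cycle, pattern = int(m.group(1)), m.group(3)
        dest, _, src = pattern.partition("=")
        if dest.startswith("["):
            # Store: address registers and the stored value are all uses.
            uses = REG_RE.findall(pattern)
        else:
            # Register destination: record the first def, the rest are uses.
            def_cycle.setdefault(REG_RE.search(dest).group(0), cycle)
            uses = REG_RE.findall(src)
        for reg in uses:
            last_use[reg] = cycle
    # Pseudos with no use inside the excerpt (r209, r211) are left out.
    return {r: (d, last_use[r]) for r, d in def_cycle.items() if r in last_use}


spans = live_spans(DUMP)
for reg, (start, end) in sorted(spans.items(), key=lambda kv: kv[1]):
    print(f"{reg}: def @{start}, last use @{end}, span {end - start}")
```

On this excerpt the helper gives r180 a span from cycle 13 to 19 and r188 from 14 to 20, so both are live together for most of the block. Going by the operations (16-bit extract, shift by 32), r180 and r188 look like the 2-def and 3-def of the annotated listing, though that mapping is my inference rather than something the dump states.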