https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114729

--- Comment #10 from Vineet Gupta <vineetg at gcc dot gnu.org> ---
Debug update from the -fsched-verbose=99 dumps (they are really, really verbose).

For the insns/regs under consideration, the canonical pre-scheduled sequence
with ideal live range (but non-ideal load-to-use delay) is the following:

  ;;   ======================================================
  ;;   -- basic block 3 from 17 to 98 -- before reload
  ;;   ======================================================

  ;;    |   35 |   10 | r170#0=[r242+low(`u')]         alu
  ;;    |   44 |    6 | [r230+low(`_Z1sv')]=r170#0     alu

  ;;    |   46 |    7 | r180=zxt(r170,0x10,0x10)       alu
  ;;    |   54 |    6 | [r230+low(const(`_Z1sv'+0x2))]=r180#0 alu

  ;;    |   55 |    7 | r188=r170 0>>0x20              alu
  ;;    |   64 |    6 | [r230+low(const(`_Z1sv'+0x4))]=r188#0 alu

  ;;    |   65 |    7 | r197=r170 0>>0x30              alu
  ;;    |   73 |    6 | [r230+low(const(`_Z1sv'+0x6))]=r197#0 alu

r170 (insn 35) is the central character whose live range has to be the longest
because of dependencies.

 - {46, 55, 65} USE r170 and are the sources which create new pseudos.
 - {54, 64, 73} are where these new pseudos sink.

How these 2 sets are interleaved defines the register pressure:
 - If interleaved as above, src1:sink1:src2:sink2:src3:sink3, 1 reg suffices.
 - If grouped as src1:src2:src3 ahead of the sinks, 3 regs are needed.
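
The register counts above can be checked with a small sketch. It is not scheduler code: it only counts how many of the new pseudos (r180, r188, r197) are live at once for a given insn order, assuming each sink insn is the last use of its pseudo (which matches the listing above). r170 itself is left out since it is live throughout either way. The function and variable names are made up for illustration.

```python
def max_live_pseudos(order, sources, sinks):
    """Peak number of simultaneously live new pseudos for an insn order.

    sources maps insn -> pseudo it defines.
    sinks maps insn -> pseudo it consumes (assumed to be the last use).
    """
    live = set()
    peak = 0
    for insn in order:
        if insn in sources:
            live.add(sources[insn])
            peak = max(peak, len(live))
        elif insn in sinks:
            live.discard(sinks[insn])
    return peak


# Insn -> pseudo, taken from the pre-scheduled sequence in this comment.
SOURCES = {46: "r180", 55: "r188", 65: "r197"}
SINKS = {54: "r180", 64: "r188", 73: "r197"}

interleaved = [46, 54, 55, 64, 65, 73]   # src1:sink1:src2:sink2:src3:sink3
grouped = [46, 55, 65, 54, 64, 73]       # src1:src2:src3 then the sinks

print(max_live_pseudos(interleaved, SOURCES, SINKS))  # 1
print(max_live_pseudos(grouped, SOURCES, SINKS))      # 3
```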

Per sched1 dumps, the "source" set gets inducted into the ready queue together:

  ;;    dependencies resolved: insn 65
  ;;    tick updated: insn 65 into ready
  ;;    dependencies resolved: insn 55
  ;;    tick updated: insn 55 into ready
  ;;    dependencies resolved: insn 46
  ;;    tick updated: insn 46 into ready
  ;;    dependencies resolved: insn 44
  ;;    tick updated: insn 44 into ready
  ;;    +------------------------------------------------------
  ;;    | Pressure costs for ready queue
  ;;    |  pressure points GR_REGS:[26->28 at 17:54] FP_REGS:[1->1 at 0:94]
  ;;    +------------------------------------------------------
  ;;    |  15   44 |    6  +3 | GR_REGS:[0 base cost 0] FP_REGS:[0 base cost 0]
  ;;    |  16   46 |    7  +3 | GR_REGS:[1 base cost 0] FP_REGS:[0 base cost 0]
               ^^^^
  ;;    |  18   55 |    7  +3 | GR_REGS:[1 base cost 1] FP_REGS:[0 base cost 0]
               ^^^^
  ;;    |  20   65 |    7  +3 | GR_REGS:[1 base cost 1] FP_REGS:[0 base cost 0]
               ^^^^
  ;;    |  11   76 |   10  +2 | GR_REGS:[1 base cost 0] FP_REGS:[-1 base cost 0]
  ;;    |   0   94 |    2  +1 | GR_REGS:[0 base cost 0] FP_REGS:[0 base cost 0]
  ;;    |  28   92 |    5  +1 | GR_REGS:[0 base cost 0] FP_REGS:[1 base cost 0]
  ;;    |  26   88 |    5  +1 | GR_REGS:[0 base cost 0] FP_REGS:[1 base cost 0]
  ;;    |  22   79 |    9  +1 | GR_REGS:[0 base cost 0] FP_REGS:[1 base cost 0]
  ;;    +------------------------------------------------------
  ;;      RFS_PRESSURE_DELAY: 7: 44 46 76 94
  ;;            RFS_PRIORITY: 6: 92 88 79
  ;;      RFS_PRESSURE_INDEX: 2: 55
  ;;    Ready list (t =  10):    65:44(cost=1:prio=7:delay=3:idx=20)  55:42(cost=1:prio=7:delay=3:idx=18)  44:39(cost=0:prio=6:delay=3:idx=15)  46:40(cost=0:prio=7:delay=3:idx=16)  76:47(cost=0:prio=10:delay=2:idx=11)  94:58(cost=0:prio=2:delay=1:idx=0)  92:56(cost=0:prio=5:delay=1:idx=28)  88:54(cost=0:prio=5:delay=1:idx=26)  79:48(cost=0:prio=9:delay=1:idx=22)
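
Since the dumps are so large, a throwaway filter helps to see which of the source/sink insns are in a given ready list. The sketch below is only that: the regex is inferred from the "Ready list" line quoted in this comment, not from any documented format, and the number after the first colon is treated as an opaque field. All names here are hypothetical.

```python
import re

# One ready-list entry, e.g. "65:44(cost=1:prio=7:delay=3:idx=20)".
# Pattern inferred from the dump text above; not a documented format.
ENTRY = re.compile(
    r"(\d+):(\d+)\(cost=(-?\d+):prio=(-?\d+):delay=(-?\d+):idx=(-?\d+)\)"
)


def ready_insns(ready_text):
    """Return the insn numbers listed in a 'Ready list' dump fragment."""
    return [int(m.group(1)) for m in ENTRY.finditer(ready_text)]


READY = """
65:44(cost=1:prio=7:delay=3:idx=20)
55:42(cost=1:prio=7:delay=3:idx=18)  44:39(cost=0:prio=6:delay=3:idx=15)
46:40(cost=0:prio=7:delay=3:idx=16)  76:47(cost=0:prio=10:delay=2:idx=11)
94:58(cost=0:prio=2:delay=1:idx=0)  92:56(cost=0:prio=5:delay=1:idx=28)
88:54(cost=0:prio=5:delay=1:idx=26)  79:48(cost=0:prio=9:delay=1:idx=22)
"""

SOURCES = {46, 55, 65}   # insns that create the new pseudos
SINKS = {54, 64, 73}     # insns that consume them

ready = set(ready_insns(READY))
sources_ready = sorted(ready & SOURCES)
sinks_ready = sorted(ready & SINKS)
print("sources in ready list:", sources_ready)
print("sinks in ready list:", sinks_ready)
```

For the t = 10 list above, this reports all three sources and none of the sinks.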

As the algorithm converges, they move around a bit, but the src and sink insns
are rarely considered in the same iteration, and if at all, only 1:

  ;;    +------------------------------------------------------
  ;;    | Pressure costs for ready queue
  ;;    |  pressure points GR_REGS:[29->29 at 0:94] FP_REGS:[1->1 at 0:94]
  ;;    +------------------------------------------------------

...

  ;;    |  19   64 |    6  +0 | GR_REGS:[-1 base cost -1] FP_REGS:[0 base cost 0]
  ;;    |  17   54 |    6  +0 | GR_REGS:[-1 base cost -1] FP_REGS:[0 base cost 0]
  ;;    |  20   65 |    7  +0 | GR_REGS:[0 base cost 0] FP_REGS:[0 base cos


All of this leads to the pessimistic schedule emitted in the end.

I'm still trying to wrap my head around the humungous dump info.
