On 10/31/24 1:35 PM, Vineet Gupta wrote:
An INSN can have multiple dependencies/predecessor nodes, some of them
being true dependency REG_DEP_TRUE meaning the predecessor register output
is a must have for the INSN to be scheduled.
e.g.
In the sched1 dump below, insn 70 has multiple deps, but 68 and 69 are
true reg deps:
;; | insn | prio |
;; | 68 | 3 | r217=r144+r146
;; | 69 | 5 | r218=flt(r143)
;; | 70 | 2 | [r217]=r218
;; insn code bb dep prio cost reservation
;; ---- ---- -- --- ---- ---- -----------
;; 70 286 6 6 2 1 alu : FW: 97tnm 91m 83tn 78m 76tn
72tn
;; : BK: 69t 68t 57n 60n 61n 64n
^^^ ^^^
Indeed insn 60 is in backwards dependency of insn 70, although its not a
true/data dependency.
'n' in dump above implies DEP_NONREG.
Right. But it might still be a true/data dependency, just one that
isn't though a REG object, but a MEM object. Also note that in some
cases we factor dependencies to prevent the various algorithms in the
scheduler from blowing up. See flush_pending_lists for example.
;; Pressure summary: GR_REGS:29 FP_REGS:3
So at a 30k foot level, does the model schedule need to actually be a
correct schedule?
From my reading of the code and the background posts pointed to by Richard [1],
it doesn't have to be - its sole purpose is to compute the *minimum* register
pressure.
My default position would be that the schedule would still need to
produce valid code. Otherwise the "minimum pressure" concept just
doesn't make much sense. I could reorder the insns in all kinds of
interesting ways that would produce a smaller pressure artifically if I
wasn't constrained by the need to produce a schedule that honored the
semantics of the original code.
I didn't see anything in Richard's 2011 post that would indicate that
the model schedule should have enough freedom to produce an invalid
schedule.
[1] https://gcc.gnu.org/legacy-ml/gcc-patches/2011-12/msg01684.html
| `The idea was to
construct a preliminary
| "model" schedule in which the only objective is to keep register
| pressure to a minimum. This schedule ignores pipeline characteristics,
| latencies, and the number of available registers. The maximum pressure
| seen in this initial model schedule (MP) is then the benchmark for ECC(X).`
It creates the initial arrangement of insns but stuff does get moved around by
main scheduler (seems to happen for this very case too, see below)
If so, then we probably can't do what you're trying
to do. We have a memory dependency between insns 60 and 70. If r211
and r217 are potentially the same, then swapping those two instructions
is unsafe.
Whao ! steely eyes.
So model schedule is indeed ordering them as 70 -> 60
;; | 0 56 | 0 8 0 15 | r210=flt(r185#0)
GR_REGS:[25,+0] FP_REGS:[0,+1]
;; | 1 57 | 0 8 1 12 | [r242+low(`e')]=r210
GR_REGS:[25,+0] FP_REGS:[1,-1]
;; | 2 63 | 2 7 0 12 | r215=r141+r183
GR_REGS:[25,+1] FP_REGS:[0,+0]
;; | 3 64 | 1 8 2 11 | r216=[r215]
GR_REGS:[26,-1] FP_REGS:[0,+1]
;; | 4 65 | 0 8 3 8 | r143=fix(r216)
GR_REGS:[25,+1] FP_REGS:[1,-1]
;; | 5 67 | 0 8 4 4 | r146=r143<<0x3
GR_REGS:[26,+1] FP_REGS:[0,+0]
;; | 6 68 | 0 8 5 3 | r217=r144+r146
GR_REGS:[27,+1] FP_REGS:[0,+0]
;; | 7 69 | 4 7 4 5 | r218=flt(r143)
GR_REGS:[28,+0] FP_REGS:[0,+1]
;; | 8 70 | 3 8 6 2 | [r217]=r218
GR_REGS:[28,-1] FP_REGS:[1,-1]
;; | 9 58 | 6 4 0 5 | r211=r138+r183
GR_REGS:[27,+1] FP_REGS:[0,+0]
;; | 10 60 | 5 5 2 4 | r213=[r211]
GR_REGS:[28,-1] FP_REGS:[0,+1]
Interesting. Was that without your change, or only with your change?
And Main scheduler reorders them back in orig order 60 -> 70
As it probably must to produce valid code barring something which tells
us the two memory objects don't alias.
And if it doesn't strictly need to be a valid schedule are we giving an
overly-optimistic view of the best that can be done from a pressure
standpoint with this change? And if so, is that wise?
As I mentioned above, the design goal of model schedule is to keep pressure to
min.
So indeed we might be a bit more optimistic than reality here. But main list
scheduler fixes
that if that leads to undesired outcomes. What we are trying to do here is not
pessimize
in certain cases, especially when that's not by design but just an outcome of
the
implementation subtlety.
And my point is that if I'm allowed to generate a minimal register
pressure schedule without needing it to generate the same semantics as
the original code, then I could drastically reduce the register pressure
:-) But I'd also claim that doing so isn't actually useful.
The mental model I'd work from is we want to know the minimal pressure
while still preserving the original code semantics -- unless someone who
knows this better (ie Richard S) were to say otherwise.
jeff