On 11/4/24 16:45, Jeff Law wrote: > On 10/31/24 1:35 PM, Vineet Gupta wrote: >>>> An INSN can have multiple dependencies/predecessor nodes, some of them >>>> being true dependency REG_DEP_TRUE meaning the predecessor register output >>>> is a must have for the INSN to be scheduled. >>>> e.g. >>>> In the sched1 dump below, insn 70 has multiple deps, but 68 and 69 are >>>> true reg deps: >>>> >>>> ;; | insn | prio | >>>> ;; | 68 | 3 | r217=r144+r146 >>>> ;; | 69 | 5 | r218=flt(r143) >>>> ;; | 70 | 2 | [r217]=r218 >>>> >>>> ;; insn code bb dep prio cost reservation >>>> ;; ---- ---- -- --- ---- ---- ----------- >>>> ;; 70 286 6 6 2 1 alu : FW: 97tnm 91m 83tn >>>> 78m 76tn 72tn >>>> ;; : BK: 69t 68t 57n 60n 61n 64n >>>> ^^^ ^^^ >> Indeed insn 60 is in backwards dependency of insn 70, although its not a >> true/data dependency. >> 'n' in dump above implies DEP_NONREG. > Right. But it might still be a true/data dependency, just one that > isn't though a REG object, but a MEM object. Also note that in some > cases we factor dependencies to prevent the various algorithms in the > scheduler from blowing up. See flush_pending_lists for example. > > >>>> ;; Pressure summary: GR_REGS:29 FP_REGS:3 >>> So at a 30k foot level, does the model schedule need to actually be a >>> correct schedule? >> From my reading of the code and the background posts pointed to by Richard >> [1], >> it doesn't have to be - its sole purpose is to compute the *minimum* >> register pressure. > My default position would be that the schedule would still need to > produce valid code. Otherwise the "minimum pressure" concept just > doesn't make much sense. I could reorder the insns in all kinds of > interesting ways that would produce a smaller pressure artifically if I > wasn't constrained by the need to produce a schedule that honored the > semantics of the original code.
Fair enough ! But does that mean any load will be potentially scheduled ahead of any store, especially if the loads address is not computed in the BB and lacking specific non-aliasing annotations or toggles. > I didn't see anything in Richard's 2011 post that would indicate that > the model schedule should have enough freedom to produce an invalid > schedule. True. >> So model schedule is indeed ordering them as 70 -> 60 >> ;; | 0 56 | 0 8 0 15 | r210=flt(r185#0) >> GR_REGS:[25,+0] FP_REGS:[0,+1] >> ;; | 1 57 | 0 8 1 12 | [r242+low(`e')]=r210 >> GR_REGS:[25,+0] FP_REGS:[1,-1] >> ;; | 2 63 | 2 7 0 12 | r215=r141+r183 >> GR_REGS:[25,+1] FP_REGS:[0,+0] >> ;; | 3 64 | 1 8 2 11 | r216=[r215] >> GR_REGS:[26,-1] FP_REGS:[0,+1] >> ;; | 4 65 | 0 8 3 8 | r143=fix(r216) >> GR_REGS:[25,+1] FP_REGS:[1,-1] >> ;; | 5 67 | 0 8 4 4 | r146=r143<<0x3 >> GR_REGS:[26,+1] FP_REGS:[0,+0] >> ;; | 6 68 | 0 8 5 3 | r217=r144+r146 >> GR_REGS:[27,+1] FP_REGS:[0,+0] >> ;; | 7 69 | 4 7 4 5 | r218=flt(r143) >> GR_REGS:[28,+0] FP_REGS:[0,+1] >> ;; | 8 70 | 3 8 6 2 | [r217]=r218 >> GR_REGS:[28,-1] FP_REGS:[1,-1] >> ;; | 9 58 | 6 4 0 5 | r211=r138+r183 >> GR_REGS:[27,+1] FP_REGS:[0,+0] >> ;; | 10 60 | 5 5 2 4 | r213=[r211] >> GR_REGS:[28,-1] FP_REGS:[0,+1] > Interesting. Was that without your change, or only with your change? With my change. >> And Main scheduler reorders them back in orig order 60 -> 70 > As it probably must to produce valid code barring something which tells > us the two memory objects don't alias. Yeah need to look if that is really causal or it just happens to be... >>> And if it doesn't strictly need to be a valid schedule are we giving an >>> overly-optimistic view of the best that can be done from a pressure >>> standpoint with this change? And if so, is that wise? >> As I mentioned above, the design goal of model schedule is to keep pressure >> to min. >> So indeed we might be a bit more optimistic than reality here. But main list >> scheduler fixes >> that if that leads to undesired outcomes. What we are trying to do here is >> not pessimize >> in certain cases, especially when that's not by design but just an outcome >> of the >> implementation subtlety. > And my point is that if I'm allowed to generate a minimal register > pressure schedule without needing it to generate the same semantics as > the original code, then I could drastically reduce the register pressure > :-) But I'd also claim that doing so isn't actually useful. > > The mental model I'd work from is we want to know the minimal pressure > while still preserving the original code semantics -- unless someone who > knows this better (ie Richard S) were to say otherwise. OK I'll see how we can tighten the constraints a bit more - I still feel there is value in this approach, of course valid codegen trumps everything. Given where we are in the release cycle, I'll post v2 with just patch 1/4 (and 2/4 squashed) with --param approach. Thx, -Vineet