On Mon, Jan 08, 2018 at 08:25:47PM +0000, Wilco Dijkstra wrote:
> > Always pairing two registers together *also* degrades code quality.
>
> No, while it's not optimal, it means smaller code and fewer memory accesses.

It means you execute *more* memory accesses.  Always.  This may be
sometimes hidden, sure.  I'm not saying you do not want more ldp's;
I'm saying this particular strategy is very far from ideal.

> >> Another issue is that after pro_and_epilogue pass I see multiple restores
> >> of the same registers and then a branch to the same block. We should try
> >> to avoid the unnecessary duplication.
> >
> > It already does that if *all* predecessors of that block do that.  If you
> > want to do it in other cases, you end up with more jumps.  That may be
> > beneficial in some cases, of course, but it is not an obvious win (and in
> > the general case it is, hrm let's use nice words, "terrible").
>
> That may well be the problem. So if there are N predecessors, of which N-1
> need to restore the same set of callee saves, but one was shrinkwrapped,
> N-1 copies of the same restores might be emitted. N could be the number
> of blocks in a function - I really hope it doesn't work out like that...

In the worst case it would.  OTOH, joining every combo into blocks costs
O(2**C) (where C is the # components) bb's worst case.

It isn't a simple problem.  The current tuning works pretty well for us,
but no doubt it can be improved!


Segher
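
To make the counting in that last exchange concrete, here is a toy Python
model.  It is not GCC code: the function names and the register names are
made up for illustration, and it assumes that each predecessor edge carries
the set of components it must restore and that predecessors can share a
restore block only when their sets are identical.

```python
from itertools import combinations


def duplicated_restores(pred_components):
    """Restore sequences emitted when every predecessor edge that needs
    restores gets its own copy (no sharing)."""
    return sum(1 for comps in pred_components if comps)


def shared_restore_blocks(pred_components):
    """Extra blocks needed when predecessors with an identical, non-empty
    component set jump to one shared restore block (toy assumption)."""
    return len({frozenset(comps) for comps in pred_components if comps})


def all_nonempty_subsets(components):
    """Every non-empty subset of the components, one per predecessor edge."""
    return [
        set(subset)
        for size in range(1, len(components) + 1)
        for subset in combinations(components, size)
    ]


def worst_case_shared_blocks(num_components):
    """Upper bound on shared blocks: one per non-empty subset, 2**C - 1."""
    return 2 ** num_components - 1


if __name__ == "__main__":
    # Five predecessors: four restore the same pair, one was shrink-wrapped.
    # "x19"/"x20" are illustrative register names only.
    preds = [{"x19", "x20"}] * 4 + [set()]
    print(duplicated_restores(preds), shared_restore_blocks(preds))

    # Worst case for sharing: every combination of C components occurs.
    for c in range(1, 6):
        edges = all_nonempty_subsets([f"c{i}" for i in range(c)])
        print(c, shared_restore_blocks(edges), worst_case_shared_blocks(c))
```

In this model the N-1 case costs N-1 copies without sharing and one extra
block (plus the jumps to it) with sharing, while the adversarial case needs
2**C - 1 shared blocks, which is where the exponential bound comes from.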