Hi,

On Tue, 19 May 2015, Richard Henderson wrote:

> It is. The relaxation that HJ is working on requires that the reads
> from the got not be hoisted. I'm not especially convinced that what
> he's working on is a win.
>
> With LTO, the compiler can do the same job that he's attempting in the
> linker, without an extra nop. Without LTO, leaving it to the linker
> means that you can't hoist the load and hide the memory latency.

Well, hoisting always needs a register, and if hoisted out of a loop
(which you all seem to be after) that register is live through the whole
loop body. You need a register for each different called function in
such a loop, trading the one GOT pointer for N other registers. For
register-starved machines this is a real problem; even x86-64 doesn't
have that many.

I.e. I'm not convinced that this hoisting will really be much of a win
that often, outside toy examples. Sure, the compiler can hoist function
addresses trivially, but I think it will lead to spilling more often than
not, or alternatively the hoisting will be undone by the register
allocator's rematerialization. Of course, this would have to be measured
for real, not hand-waved, but, well, I'd be surprised if it's not so.

Ciao,
Michael.