On Mon, Dec 02, 2019 at 06:07:23PM -0600, Segher Boessenkool wrote:
> Hi!
>
> On Fri, Nov 15, 2019 at 07:17:34PM -0500, Michael Meissner wrote:
> > This series of patches adds the PCREL_OPT optimization for the PC-relative
> > support in the PowerPC compiler.
> >
> > This optimization convert a single load or store of an external variable to
> > use
> > the R_PPC64_PCREL_OPT relocation.
> >
> > For example, a normal load of an external int variable (with -mcpu=future)
> > would generate:
> >
> >         PLD 9,ext_symbol@got@pcrel(0),1
> >         LWA 10,0(9)
> >
> > That is, load the address of 'ext_symbol' into register 9. If 'ext_symbol'
> > is

If you want to show a load of an int ext_symbol, the example should be
        pld 9,ext_symbol@got@pcrel
        lwz 10,0(9)

Add "(0),1" to the end of the pld line to show the optional operands.

> > defined in another module in the main program, and the current module is
> > also
> > in the main program, the linker will optimize this to:
>
> What does "module" mean? Translation unit? Object file? And "main
> program" is what ELF calls "executable", right?

"relocatable object file" and "executable or shared library"
respectively. If the linker is creating a shared library, and
ext_symbol is local to that library by virtue of non-default symbol
visibility or symbol versioning, then the optimisation will be done
for shared libraries too.

> You don't need to say that here, it is not something the compiler can do
> anything about. You could just say "if possible, the linker will..." etc.
>
> >         PADDI 9,ext_symbol(0),1
> >         LWZ 10,0(9)
>
> I don't think it will change an lwa insn to an lwz? Probably it should
> be lwz throughout?

Yes, see above. Changing got indirect to pc-relative (or toc-relative
for that matter) is something the linker does even in the absence of
PCREL_OPT relocs. That's what Mike was trying to show with the above
transformation, modulo typos.

> Is that "paddi" syntax correct? I think you might mean
> "paddi 9,0,ext_symbol,1", aka "pla 9,ext_symbol"?

No, it's not correct but your corrections aren't correct either. :)
        pla 9,ext_symbol@pcrel  # add (0),1 for optional operands
or
        paddi 9,0,ext_symbol@pcrel,1

You'll get the wrong reloc without @pcrel.

> > If either the definition is not in the main program or we are linking for a
> > shared library, the linker will create an address in the .got section and
> > do a
> > PLD of it:
> >
> >         .section .got
> > .got.ext_symbol:
> >         .quad ext_symbol
> >
> >         .section .text
> >         PLD 9,.got.ext_symbol(0),1
> >         LWZ 10,0(9)
>
> Like what the user wrote, sure -- the linker does not optimise it, does
> not change it?
> Or am I missing something?
>
> > If the only use of the GOT address is a single load and store, we can
> > optimize
> > this further:
>
> A single load *or* store.
>
> >         PLD 9,ext_symbol@got@pcrel(0),1
> > .Lpcrel1:
> >         .reloc .Lpcrel1-8,R_PPC64_PCREL_OPT,.-(.Lpcrel1-8)
> >         LWZ 10,0(9)
> >
> > In this case, if the variable is defined in another module for the main
> > program, and we are linking for the main program, the linker will transform
> > this to:
> >
> >         PLWZ 10,ext_symbol@pcrel(0),1
> >         NOP
> >
> > There can be arbitrary instructions between the PLD and the LWA (or STW).
>
> ... because that is what that relocation means. The compiler still has
> to make sure that any such insns should not prevent this transform.

Right, and that's the hard part of this transformation.

> > For either loads or store, register 9 must only be used in the load or
> > store,
> > and must die at that point.
> >
> > For loads, there must be no reference to register 10 between the PLD and the
> > LWZ. For a store, register 10 must be live at the PLD instruction, and must
> > not be modified between the PLD and the STW instructions.
>
> "No reference"... Nothing indirect either (like from a function call,
> or simply some insn that does not name the register directly). Or code
> like
>
>         pld 9,ext_symbol@got@pcrel(0),1 ; .Lpcrel1:
>         .reloc .Lpcrel1-8,R_PPC64_PCREL_OPT,.-(.Lpcrel1-8)
>         b 2f
>
> here:   # some code that does not explicitly reference r10 here,
>         # but r10 is live here nevertheless, and is used later
>         b somewhere_else
>
> 2:      lwz 10,0(9)
>
> complicates your analysis, too. So something DF is needed here, or
> there are lots and lots and lots of cases to look out for.
>
>
> Segher

--
Alan Modra
Australia Development Lab, IBM
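
For illustration only, the straight-line register conditions quoted
above (address register used nowhere else, load destination not
referenced, store source not modified) can be sketched as a small C
check. This is a minimal sketch under stated assumptions, not GCC code:
insn_t, REG and pcrel_opt_between_ok are hypothetical names, and each
insn is assumed to carry complete read/write register masks, including
implicit uses such as those of a call. It covers only insns that sit
linearly between the pld and the load or store, so it says nothing
about the branch case Segher shows, which needs real dataflow
information.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified insn model (not GCC's RTL): bit N of each
   mask stands for GPR N.  The masks are assumed to include implicit
   uses and clobbers, e.g. those of a function call.  */
typedef struct {
    uint64_t reads;   /* registers read by the insn */
    uint64_t writes;  /* registers written or clobbered by the insn */
} insn_t;

#define REG(n) ((uint64_t)1 << (n))

/* Check the insns strictly between the pld and the load/store.
   addr is the register holding the GOT address (r9 in the thread),
   data is the register loaded or stored (r10 in the thread).
   Returns true if none of the insns blocks the transformation.
   Straight-line code only; control flow is NOT handled.  */
bool pcrel_opt_between_ok(const insn_t *mid, size_t n,
                          int addr, int data, bool is_load)
{
    for (size_t i = 0; i < n; i++) {
        uint64_t touched = mid[i].reads | mid[i].writes;

        /* The address register may only be used by the load/store.  */
        if (touched & REG(addr))
            return false;

        if (is_load) {
            /* The load effectively moves up to the pld, so the
               destination must not be read or written in between.  */
            if (touched & REG(data))
                return false;
        } else {
            /* The store effectively moves up to the pld, so the stored
               value must not change in between; reads are harmless.  */
            if (mid[i].writes & REG(data))
                return false;
        }
    }
    return true;
}
```

A caller would still have to establish separately that the address
register dies at the load or store and, for a store, that the data
register is live at the pld; neither is modelled here.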