Hi!

On Fri, Nov 15, 2019 at 07:17:34PM -0500, Michael Meissner wrote:
> This series of patches adds the PCREL_OPT optimization for the PC-relative
> support in the PowerPC compiler.
>
> This optimization convert a single load or store of an external variable to use
> the R_PPC64_PCREL_OPT relocation.
>
> For example, a normal load of an external int variable (with -mcpu=future)
> would generate:
>
>         PLD 9,ext_symbol@got@pcrel(0),1
>         LWA 10,0(9)
>
> That is, load the address of 'ext_symbol' into register 9. If 'ext_symbol' is
> defined in another module in the main program, and the current module is also
> in the main program, the linker will optimize this to:

What does "module" mean?  Translation unit?  Object file?  And "main
program" is what ELF calls "executable", right?

You don't need to say that here, it is not something the compiler can do
anything about.  You could just say "if possible, the linker will..." etc.

>         PADDI 9,ext_symbol(0),1
>         LWZ 10,0(9)

I don't think it will change an lwa insn to an lwz?  Probably it should
be lwz throughout?

Is that "paddi" syntax correct?  I think you might mean
"paddi 9,0,ext_symbol,1", aka "pla 9,ext_symbol"?

> If either the definition is not in the main program or we are linking for a
> shared library, the linker will create an address in the .got section and do a
> PLD of it:
>
>         .section .got
> .got.ext_symbol:
>         .quad ext_symbol
>
>         .section .text
>         PLD 9,.got.ext_symbol(0),1
>         LWZ 10,0(9)

Like what the user wrote, sure -- the linker does not optimise it, does
not change it?  Or am I missing something?

> If the only use of the GOT address is a single load and store, we can optimize
> this further:

A single load *or* store.

>         PLD 9,ext_symbol@got@pcrel(0),1
> .Lpcrel1:
>         .reloc .Lpcrel1-8,R_PPC64_PCREL_OPT,.-(.Lpcrel1-8)
>         LWZ 10,0(9)
>
> In this case, if the variable is defined in another module for the main
> program, and we are linking for the main program, the linker will transform
> this to:
>
>         PLWZ 10,ext_symbol@pcrel(0),1
>         NOP
>
> There can be arbitrary instructions between the PLD and the LWA (or STW).

... because that is what that relocation means.  The compiler still has
to make sure that any such insns should not prevent this transform.

> For either loads or store, register 9 must only be used in the load or store,
> and must die at that point.
>
> For loads, there must be no reference to register 10 between the PLD and the
> LWZ.  For a store, register 10 must be live at the PLD instruction, and must
> not be modified between the PLD and the STW instructions.

"No reference"...
Nothing indirect either (like from a function call, or simply some insn
that does not name the register directly).  Or code like

        pld 9,ext_symbol@got@pcrel(0),1 ; .Lpcrel1: .reloc .Lpcrel1-8,R_PPC64_PCREL_OPT,.-(.Lpcrel1-8)
        b 2f

here:
        # some code that does not explicitly reference r10 here,
        # but r10 is live here nevertheless, and is used later
        b somewhere_else

2:      lwz 10,0(9)

complicates your analysis, too.  So something DF is needed here, or there
are lots and lots and lots of cases to look out for.


Segher