On Tue, Sep 03, 2019 at 06:33:26PM -0500, Segher Boessenkool wrote:
> On Tue, Sep 03, 2019 at 07:20:13PM -0400, Michael Meissner wrote:
> > On Tue, Sep 03, 2019 at 05:56:03PM -0500, Segher Boessenkool wrote:
> > > Hi!
> > >
> > > On Mon, Aug 26, 2019 at 05:43:41PM -0400, Michael Meissner wrote:
> > > > /* This file implements a RTL pass that looks for pc-relative loads of the
> > > >    address of an external variable using the PCREL_GOT relocation and a single
> > > >    load/store that uses that GOT pointer.
> > >
> > > Does this work better than having a peephole for it?  Is there some reason
> > > you cannot do this with a peephole?
> >
> > Yes.  Peepholes only look at adjacent insns.
>
> Huh.  Wow.  Would you believe I never knew that (or I forgot)?  Well, that
> explains why peepholes aren't very effective for us at all, alright!
>
> > This optimization allows the load
> > of the GOT address to be separated from the eventual load or store.
> >
> > Peephole2's are likely too early, because you really, really, really don't want
> > any other pass moving things around.
>
> That is a bit worrying...  What can go wrong?

As I say in the comments, with PCREL_OPT, you must have exactly one load of the
address and one load or store that references the load of the address.  If
something duplicates one of the loads or stores, or adds another reference to
the address, or just moves it so we can't link the loading of the address to
the final load/store, it will not work.

For stores, the value being stored must be live at both the loading of the
address and the store.

For loads, the register being loaded must not be used between the loading of
the address and the final load.

I.e. in:

        PLD r1,foo@got@pcrel
.Lpcrel1:

        # other instructions

        .reloc .Lpcrel1-8,R_PPC64_PCREL_OPT,.-(.Lpcrel1-8)
        LWZ r2,0(r1)

If you get lucky and foo is defined in the same compilation unit, this will get
turned into:

        PLWZ r2,foo@pcrel

        # other instructions

        NOP

If foo is defined in a shared library (or you are linking for a shared library,
and foo is defined in the main program or another shared library), you get:

        PLD r1,.got.foo@pcrel

        # other instructions

        LWZ r2,0(r1)

        .section .got
.got.foo:
        .quad foo

So for loads, r2 must not be used between the PLD and LWZ instructions.

Similarly for stores:

        PLD r1,foo@got@pcrel
.Lpcrel1:

        # other instructions

        .reloc .Lpcrel1-8,R_PPC64_PCREL_OPT,.-(.Lpcrel1-8)
        STW r2,0(r1)

If you get lucky, this becomes:

        PSTW r2,foo@pcrel

        # other instructions

        NOP

If foo is defined in a shared library (or you are linking for a shared library,
and foo is defined in the main program or another shared library), you get:

        PLD r1,.got.foo@pcrel

        # other instructions

        STW r2,0(r1)

        .section .got
.got.foo:
        .quad foo

So as I said, r2 must be live between the PLD and STW, because you don't know
if the PLD will be replaced with a PSTW or not.

So to keep other passes from 'improving' things, I opted to do the pass as the
last pass before final.

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meiss...@linux.ibm.com, phone: +1 (978) 899-4797
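
The register constraints described above can be sketched as a small check over the "other instructions" that sit between the PLD and the final load or store. This is only an illustration in Python, not the code of the RTL pass: the `Insn` record and the `pcrel_opt_ok` function are hypothetical names, the model covers straight-line code only, and it does not look for extra references to the address after the final load/store. Treating a write to the data register as disqualifying (for both loads and stores) is my reading of the "must be live" and "must not be used" rules, not something stated in the message.

```python
from dataclasses import dataclass, field


@dataclass
class Insn:
    """Hypothetical, simplified instruction: the registers it writes and reads."""
    text: str
    defs: set = field(default_factory=set)
    uses: set = field(default_factory=set)


def pcrel_opt_ok(between, addr_reg, data_reg, is_store):
    """Return True if the PLD and the final load/store may be linked.

    between  -- instructions between the PLD and the final load/store
    addr_reg -- register set by the PLD (r1 in the examples)
    data_reg -- register loaded or stored by the final insn (r2)
    """
    for insn in between:
        # The address may only be referenced by the final load/store,
        # and it must not be overwritten before that insn.
        if addr_reg in insn.uses or addr_reg in insn.defs:
            return False
        if is_store:
            # The store may end up at the PLD's position, so the value
            # must already be in data_reg there and stay unchanged
            # (assumption: "live at both points" means not redefined).
            if data_reg in insn.defs:
                return False
        else:
            # The load may end up at the PLD's position, so nothing in
            # between may read data_reg (or, by assumption, write it).
            if data_reg in insn.uses or data_reg in insn.defs:
                return False
    return True


if __name__ == "__main__":
    harmless = [Insn("add r5,r6,r7", defs={"r5"}, uses={"r6", "r7"})]
    reads_r2 = [Insn("add r5,r2,r7", defs={"r5"}, uses={"r2", "r7"})]
    print(pcrel_opt_ok(harmless, "r1", "r2", is_store=False))
    print(pcrel_opt_ok(reads_r2, "r1", "r2", is_store=False))
    print(pcrel_opt_ok(reads_r2, "r1", "r2", is_store=True))
```

In the load case an intervening read of r2 is rejected, because the linker may turn the PLD into a PLWZ that sets r2 early. In the store case the same read is harmless, since a PSTW only reads r2.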