On Wed, Jul 24, 2013 at 5:23 PM, Ian Lance Taylor <i...@google.com> wrote:
>> * The foo@plt pseudo-symbols that e.g. objdump will display are based on
>>   the BFD backend knowing the size of PLT entries.  Arguably this ought
>>   to look at sh_entsize of .plt instead of using baked-in knowledge, but
>>   it doesn't.
>
> This seems fixable.  Of course, we could also keep the PLT the same
> length by changing it.  The current PLT entries are
>
>     jmpq *GOT(sym)
>     pushq offset
>     jmpq plt0
>
> The linker or dynamic linker initializes *GOT(sym) to point to the
> second instruction in this sequence.  So we can keep the PLT at 16
> bytes by simply changing it to jump somewhere else.
>
>     bnd jmpq *GOT(sym)
>     .skip 9
>
> We have the linker or dynamic linker fill in *GOT(sym) to point to the
> second PLT table.  When the dynamic linker is involved, we use another
> DT tag to point to the second PLT.  The offsets are consistent: there
> is one entry in each PLT table, so the dynamic linker can compute the
> right value.  Then in the second PLT we have the sequence
>
>     pushq offset
>     bnd jmpq plt0
>
> That gives the dynamic linker the offset that it needs to update
> *GOT(sym) to point to the runtime symbol value.  So we get slightly
> worse instruction cache handling the first time a function is called,
> but after that we are the same as before.  And PLT entries are the
> same size as always so everything is simpler.
>
> The special DT tag will tell the dynamic linker to apply the special
> processing.  No attribute is needed to change behaviour.  The issue
> then is: a program linked in this way will not work with an old
> dynamic linker, because the old dynamic linker will not initialize
> GOT(sym) to the right value.  That is a problem for any scheme, so I
> think that is OK.  But if that is a concern, we could actually handle
> by generating two PLTs.  One conventional PLT, and another as I just
> outlined.  The linker branches to the new PLT, and initializes
> GOT(sym) to point to the old PLT.  The dynamic linker spots this
> because it recognizes the new DT tags, and cunningly rewrites the GOT
> to point to the new PLT.  Cost is an extra jump the first time a
> function is called when using the old dynamic linker.
>
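
For reference, the address arithmetic behind the two-table scheme quoted
above can be sketched in a few lines of Python.  This is only an
illustration under stated assumptions: the function names and base
addresses are hypothetical, and the sketch assumes the second table has
16-byte entries and no header, which the quoted text does not specify.
Only the 16-byte size of the first PLT's entries comes from the
discussion.

```python
PLT_ENTRY_SIZE = 16  # conventional x86-64 PLT entry size, as discussed above


def first_plt_entry_addr(plt_base, index, header_size=PLT_ENTRY_SIZE):
    """Address of the index-th 'bnd jmpq *GOT(sym)' entry.

    Assumes a PLT0 header of header_size bytes precedes the entries.
    """
    return plt_base + header_size + index * PLT_ENTRY_SIZE


def second_plt_entry_addr(second_plt_base, index, entry_size=PLT_ENTRY_SIZE):
    """Address of the matching 'pushq offset; bnd jmpq plt0' entry.

    Hypothetical layout: no header, fixed entry_size per symbol.
    """
    return second_plt_base + index * entry_size


def initial_got_values(second_plt_base, count, entry_size=PLT_ENTRY_SIZE):
    """Values *GOT(sym) would start with so the first call reaches the second PLT."""
    return [second_plt_entry_addr(second_plt_base, i, entry_size)
            for i in range(count)]
```

Because both tables are indexed by the same symbol number, a dynamic
linker that knows the second table's base (from the new DT tag) could
compute every initial GOT value without any per-entry metadata.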
I don't like the complexity.  I believe extending the PLT entry to
32 bytes works with the old ld.so.  If we are willing to have mixed PLT
entries, we can merge two 16-byte PLT entries into one super 32-byte PLT
entry so that we can have

    jmpq *name@GOTPCREL(%rip)
    pushq $index
    jmpq PLT0
    bnd jmpq *name@GOTPCREL(%rip)
    pushq $index
    bnd jmpq PLT0
    nop paddings
    jmpq *name@GOTPCREL(%rip)
    pushq $index
    jmpq PLT0

We can also have new link-time relocations for branches with the BND
prefix and only create the super PLT entries when needed.  Of course,
unwind info may be incorrect for both approaches if we don't find a way
to fix it.

--
H.J.
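
To make the size argument concrete, here is a small Python sketch that
adds up the instruction sizes and lays out a mixed PLT.  It assumes the
usual x86-64 encodings (6 bytes for the RIP-relative indirect jmpq,
5 bytes for pushq with a 32-bit immediate, 5 bytes for jmpq rel32, and
1 byte for each BND prefix) and a 16-byte PLT0.  All names are
hypothetical; this is not how any linker actually implements it.

```python
SLOT = 16  # one conventional PLT slot, in bytes

# Assumed instruction sizes for the three instructions of a PLT entry.
INSN_BYTES = {"jmpq_indirect": 6, "pushq_imm32": 5, "jmpq_rel32": 5}
BND_PREFIX = 1  # one extra byte per prefixed branch


def entry_bytes(bnd):
    """Code bytes in one PLT entry; both jumps carry the prefix when bnd is set."""
    size = sum(INSN_BYTES.values())
    if bnd:
        size += 2 * BND_PREFIX
    return size


def slots_needed(bnd):
    """Number of 16-byte slots the entry occupies (ceiling division)."""
    return -(-entry_bytes(bnd) // SLOT)


def layout(kinds):
    """Return (offset, size, nop_padding) per entry, placed after a 16-byte PLT0.

    kinds is a list of booleans: True for a BND entry, False for a plain one.
    """
    offset = SLOT  # skip PLT0
    result = []
    for bnd in kinds:
        size = slots_needed(bnd) * SLOT
        result.append((offset, size, size - entry_bytes(bnd)))
        offset += size
    return result
```

Under these assumptions a plain entry is exactly 16 bytes, while a BND
entry needs 18 bytes and therefore spills into a second slot, leaving
14 bytes of nop padding.  For the plain, BND, plain sequence shown above,
`layout([False, True, False])` places the entries at offsets 16, 32 and
64.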