On Sat, Sep 01, 2012 at 12:48:50AM +0200, Jilles Tjoelker wrote:
> On Tue, Aug 28, 2012 at 02:03:22PM +0300, Konstantin Belousov wrote:
> > On Sat, Aug 25, 2012 at 12:16:55AM +0200, Jilles Tjoelker wrote:
> > > Not exporting .cerror causes it to be jumped to directly instead of via
> > > the PLT.
> > >
> > > The below patch is for i386 only and also takes advantage of .cerror's
> > > new status by not saving and loading %ebx before jumping to it.
> > > (Therefore, .cerror now saves and loads %ebx itself.) Where there was a
> > > conditional jump to a jump to .cerror, the conditional jump has been
> > > changed to jump to .cerror directly (many modern CPUs don't do static
> > > prediction and in any case it is not much of a benefit anyway).
> >
> > Why do you need to save/restore the %ebx at all ? %ebx ==
> > &__GLOBAL_OFFSET_TABLE__ is only needed when you access GOT, but .cerror
> > only works with PLT, which is addressed using the instruction capable of
> > relative addressing. The old .cerror does not need it as well, but it is
> > just engraved in the function ABI.
>
> On i386, a shared object's PLT entry needs %ebx set up to work properly.
> This is because such a PLT entry needs to access the GOT to find the
> address to jump to (the first instruction is jmp *d32(%ebx)).
>
> An executable's PLT entry accesses the GOT via absolute addressing and
> therefore does not need %ebx.

Doh, right.
Still, these manipulations can be removed; we just need to resolve
__error in some libc constructor. It is not very important after your
patch, because the ABI is now much more regular, but I think removing
the additional stack operations is still beneficial.

> > > The patch decreases the size of libc.so.7 by a few kilobytes.
> > >
> > > Similar changes could be made to other architectures, and there may be
> > > more symbols that are exported but need not be.
> >
> > Sure, would you handle at least amd64 too ?
>
> The below patch handles amd64.
>
> I'm a bit annoyed that most of the syscall stubs are 17 bytes long now
> and have the maximum 15 bytes of padding. This means that the patch
> provides virtually no gain in code size.

Stubs can be converted to only load the syscall number into %rax and
jump unconditionally to common code, which would perform the kernel call
and do the post-syscall bookkeeping to update errno.

Otherwise, looks good.
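
To make the stub idea above concrete, here is a rough C model of it. It is
only a sketch: the real stubs would be assembly, and the transfer to the
common code would be a jump, not a call. The names fake_kernel,
common_syscall, model_ok and model_fail, and the two syscall numbers, are
all made up for illustration. The only thing taken from the real ABI is
that a failed call is signalled with the carry flag and the error code in
the return register.

```c
#include <assert.h>
#include <errno.h>

/* Model of a kernel result: on failure the carry flag is set and the
 * return register holds the errno code instead of a value. */
struct kresult {
    long value;  /* return value, or errno code when carry is set */
    int  carry;  /* nonzero means the call failed */
};

/* Hypothetical syscall numbers, used by this model only. */
enum { SYS_MODEL_OK = 1, SYS_MODEL_FAIL = 2 };

/* Stand-in for the kernel entry; this is not a real system call. */
static struct kresult fake_kernel(int num, long arg)
{
    struct kresult r;

    assert(num == SYS_MODEL_OK || num == SYS_MODEL_FAIL);
    if (num == SYS_MODEL_OK) {
        r.value = arg + 1;
        r.carry = 0;
    } else {
        r.value = ENOENT;
        r.carry = 1;
    }
    return r;
}

/* Common code shared by every stub: perform the kernel call, then do the
 * post-syscall bookkeeping that updates errno. */
static long common_syscall(int num, long arg)
{
    struct kresult r = fake_kernel(num, arg);

    if (r.carry) {
        errno = (int)r.value;
        return -1;
    }
    return r.value;
}

/* Per-syscall stubs: each one only supplies its number and transfers
 * control to the common code. */
long model_ok(long arg)   { return common_syscall(SYS_MODEL_OK, arg); }
long model_fail(long arg) { return common_syscall(SYS_MODEL_FAIL, arg); }
```

With this shape each stub shrinks to "load number, jump", so the errno
handling exists once instead of being repeated (and padded) in every stub.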