On Wed, Jul 05, 2017 at 08:38:50AM -0700, H.J. Lu wrote:
> On x86-64, __tls_get_addr has to realign the stack so that binaries compiled by
> GCCs older than GCC 4.9.4:
>
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=58066
>
> continue to work even if vector instructions are used by functions called
> from __tls_get_addr, which assumes 16-byte stack alignment as specified
> by x86-64 psABI.
>
> We are considering to add an alternative interface, ___tls_get_addr, to
> glibc, which doesn't realign stack. Compilers, which properly align stack
> for TLS, generate calls to ___tls_get_addr, instead of __tls_get_addr,
> if ___tls_get_addr is available.
>
> Any comments?

I think it's unnecessary. The fast path of __tls_get_addr is trivial
to write in asm, where alignment doesn't matter, and then the cost of
alignment only enters in the slow path anyway. Implementations like
glibc and musl that need to be compatible with old binaries can easily
do this, and ones which know they'll only be running binaries built
without the gcc bug don't need to care.

(FWIW, it can probably be done without asm too, just using an
intermediate function with the right attribute to force realignment,
if you have an attribute to suppress use of vector instructions in the
top-level fast-path __tls_get_addr C.)

Note that if you make the change and have gcc generate calls to the
new ___tls_get_addr symbol, it's going to be problematic for people
trying to link to older glibc versions that don't have it.

Rich
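
For illustration only, here is a rough sketch of the non-asm variant
described in the parenthetical above. It is a toy model, not glibc's or
musl's code: the names toy_tls_get_addr, toy_tls_slow, struct
toy_tls_index, toy_dtv and the TOY_* macros are all made up, and the
"dtv" is a plain static array rather than real per-thread state. It
assumes GCC or Clang on x86-64, where the force_align_arg_pointer and
target("no-sse") function attributes are available; on other targets
the macros expand to nothing.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical helper macros. The attributes themselves are real GCC/Clang
   x86 function attributes; everything named TOY_/toy_ is invented here. */
#if defined(__x86_64__) && defined(__GNUC__)
/* Realign the stack on entry, and keep the function out of line so the
   realignment cost stays in the slow path. */
#define TOY_REALIGN   __attribute__((force_align_arg_pointer, noinline))
/* Ask the compiler not to emit SSE/AVX code in this function, so a
   misaligned incoming stack should not matter here. */
#define TOY_NO_VECTOR __attribute__((target("no-sse")))
#else
#define TOY_REALIGN
#define TOY_NO_VECTOR
#endif

#define TOY_MAX_MODULES 8
#define TOY_BLOCK_SIZE  64

/* Stand-in for the (module, offset) pair passed to __tls_get_addr. */
struct toy_tls_index {
    size_t module;
    size_t offset;
};

/* Toy "dtv": one lazily allocated block per module. A real
   implementation keeps this per thread. */
static char *toy_dtv[TOY_MAX_MODULES];

/* Slow path: may call into arbitrary code (here, calloc) that assumes
   the psABI's 16-byte stack alignment, so the stack is realigned here. */
static TOY_REALIGN void *toy_tls_slow(const struct toy_tls_index *ti)
{
    char *block = calloc(1, TOY_BLOCK_SIZE);
    if (!block)
        abort();
    toy_dtv[ti->module] = block;
    return block + ti->offset;
}

/* Fast path: a table lookup and an add, no realignment needed. */
TOY_NO_VECTOR void *toy_tls_get_addr(const struct toy_tls_index *ti)
{
    assert(ti->module < TOY_MAX_MODULES);
    char *block = toy_dtv[ti->module];
    if (block)
        return block + ti->offset;
    return toy_tls_slow(ti);
}
```

The first call for a given module goes through toy_tls_slow; later
calls for the same module return from the fast path without touching
the realigning function.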