http://gcc.gnu.org/bugzilla/show_bug.cgi?id=55354



--- Comment #9 from Konstantin Serebryany <konstantin.s.serebryany at gmail dot com> 2012-11-18 19:35:43 UTC ---

As dvyukov@ pointed out, -ftls-model=initial-exec improves the situation, but
does not fully solve the problem.



Experiment: 



% cat x.c 
__thread int a;
int foo() {
  return a;
}





HORRIBLE: -fPIC -shared
% gcc x.c -O2 -fPIC -shared -o x.so  ; objdump -d x.so  | grep foo.: -A 5 
00000000000006e0 <foo>:
 6e0:   66 48 8d 3d f0 08 20    lea    0x2008f0(%rip),%rdi        # 200fd8 <_DYNAMIC+0x1b8>
 6e7:   00 
 6e8:   66 66 48 e8 10 ff ff    callq  600 <__tls_get_addr@plt>
 6ef:   ff 
 6f0:   8b 00                   mov    (%rax),%eax





NOT-SO-BAD: -fPIC -shared  -ftls-model=initial-exec
% gcc x.c -O2 -fPIC -shared -o x.so  -ftls-model=initial-exec ; objdump -d x.so | grep foo.: -A 5 
0000000000000630 <foo>:
 630:   48 8b 05 a9 09 20 00    mov    0x2009a9(%rip),%rax        # 200fe0 <_DYNAMIC+0x1b8>
 637:   64 8b 00                mov    %fs:(%rax),%eax
 63a:   c3                      retq   





GOOD: -fPIE 
% gcc -c x.c -O2 -fPIE -o x.o  ; objdump -d x.o  | grep foo.: -A 5 
0000000000000000 <foo>:
   0:   64 8b 04 25 00 00 00    mov    %fs:0x0,%eax
   7:   00 
   8:   c3                      retq   





So, while -ftls-model=initial-exec improves the TLS performance, it is still
2x slower than -fPIE: two loads (the TLS offset from the GOT, then the
%fs-relative access) instead of one.
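
For anyone who wants to reproduce the NOT-SO-BAD case without changing the
flags for the whole file: GCC also documents a per-variable tls_model
attribute. Below is a minimal sketch of x.c using it, assuming GCC or a
compatible compiler. I have not measured this variant; set_a() is a
hypothetical helper that is not in the original x.c and exists only so the
example can be exercised.

```c
/* Variant of x.c: the TLS model is requested per variable via GCC's
 * tls_model attribute instead of the -ftls-model command-line flag. */
__thread int a __attribute__((tls_model("initial-exec")));

int foo(void) {
    return a;
}

/* Hypothetical helper (not in the original x.c), added only so the
 * example can be exercised from a test or a small driver. */
void set_a(int v) {
    a = v;
}
```

Building this with the same "-O2 -fPIC -shared" command as above and looking
at foo in objdump should show whether the attribute had the intended effect.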



For tsan, which does this for *every* memory access in the original program,
this will cost a 5%-10% slowdown.
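
To illustrate why the per-access cost adds up, here is a rough sketch of
what instrumentation of this kind looks like: every load in the user's code
is preceded by a call into the runtime, and that call touches thread-local
state. All names here (thread_state, on_read, access_count,
instrumented_sum) are hypothetical and are not tsan's real runtime
interface.

```c
#include <stdint.h>

/* Hypothetical per-thread runtime state; not tsan's real layout. */
struct thread_state {
    uint64_t accesses;
};

static __thread struct thread_state cur_thread;

/* Hypothetical hook that the compiler pass would emit before each load.
 * The thread-local access in here is paid once per instrumented access. */
void on_read(const void *addr) {
    (void)addr;
    cur_thread.accesses++;
}

/* Number of instrumented reads made so far by the calling thread. */
uint64_t access_count(void) {
    return cur_thread.accesses;
}

/* What a simple loop looks like after instrumentation: one hook call,
 * and therefore one TLS access, per element read. */
int instrumented_sum(const int *p, int n) {
    int sum = 0;
    for (int i = 0; i < n; i++) {
        on_read(&p[i]);
        sum += p[i];
    }
    return sum;
}
```

How the access to cur_thread is compiled depends on the TLS model in effect,
which is why the choice between the variants above matters for the runtime.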



For our users this is a big deal, so they will link the static library whenever
possible. Which default gcc uses, I don't care that much.
