http://gcc.gnu.org/bugzilla/show_bug.cgi?id=55354
--- Comment #17 from Dmitry Vyukov <dvyukov at google dot com> 2012-11-19 10:53:04 UTC ---
>When building libtsan as a shared library (for which I had to hack our assembly
>blobs a bit) we get two sources of slowdown:
> 1. __tsan_read8 and friends are called through PLT
> 2. __tsan_read8 and friends use one extra load to get to TLS

> I bet 9.5% or more of that is due to the PLT call.

That's not the overhead you are looking for, Luke.
We currently compile with -fPIC and link statically; the linker inserts only 1
memory dereference in this case. However, -fPIC affects code generation in the
compiler: it has to reserve more registers for the TLS access code and has to
allocate a stack frame because of the potential call. That alone causes a *20%*
slowdown on a real application (not a synthetic benchmark).

Kostya, to evaluate initial-exec you need to ensure that the code
characteristics of __tsan_read/write are not affected, i.e. 0 stack spills and
the analyze script passes. Everything else we have w/o initial-exec.
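
For reference, a minimal sketch of what requesting the initial-exec TLS model looks like in C. It assumes GCC or Clang (the `__thread` keyword and the `tls_model` attribute are compiler extensions); `thread_state`, `cur_thread`, `record_read8` and `reads_seen` are hypothetical names for illustration, not libtsan's actual symbols or data layout.

```c
#include <stdint.h>

/* Hypothetical stand-in for per-thread runtime state. */
struct thread_state {
    uint64_t access_count;
};

/* Ask for the initial-exec TLS model on this variable. With this model the
 * variable is addressed at a fixed offset from the thread pointer, so the
 * access sequence typically needs no call into the dynamic TLS resolver. */
__thread struct thread_state cur_thread
    __attribute__((tls_model("initial-exec")));

/* Hypothetical hot-path hook: touches TLS on every call. */
void record_read8(const void *addr) {
    (void)addr;                  /* address is unused in this sketch */
    cur_thread.access_count++;
}

/* Returns how many reads the calling thread has recorded. */
uint64_t reads_seen(void) {
    return cur_thread.access_count;
}
```

Compiling this with -fPIC -O2 -S, with and without the attribute, and comparing the assembly of record_read8 is one way to see whether the function still needs a stack frame or spills registers. The exact output depends on the compiler version and target.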