On Thu, Feb 12, 2015 at 3:18 PM, Ulrich Weigand <uweig...@de.ibm.com> wrote:
> Hello,
>
> we're running into a problem related to use of initial-exec access to
> TLS variables in dynamically-loaded libraries.  Now, in general, this
> is actually not supported.  However, there seems to be an "unofficial"
> extension that allows selected system libraries to use small amounts
> of static TLS space to allow critical variables to be defined to use
> the initial-exec model even in dynamically-loaded libraries.

This sounds very similar to the discussion here:

https://sourceware.org/ml/libc-alpha/2014-10/msg00134.html

though my brain is too frazzled today to remember what the conclusion was.

regards
Ramana

>
> One example of a system library that does this is libgomp, the OpenMP
> support library provided with GCC.  Here's an email thread from the
> gcc mailing lists debating the use of the initial-exec model:
>
>   [gomp] Avoid -Wl,-z,nodlopen (PR libgomp/28482)
>   https://gcc.gnu.org/ml/gcc-patches/2007-05/msg00097.html
>
> The idea why this is supposed to work is that glibc/ld.so will always
> allocate a small amount of surplus static TLS data space at startup.
> As long as the total amount of initial-exec TLS variables defined in
> dynamically-loaded libraries fits into that extra space, everything
> is supposed to work out fine.  This could be ensured by allowing
> only certain defined system libraries to use this extension.
>
> However, in fact there is a *second* restriction, which may cause
> loading a library requiring static TLS to fail, *even if* there
> still is enough surplus space.  This is due to the following check
> in dl-open.c:dl_open_worker:
>
>   /* For static TLS we have to allocate the memory here and
>      now.  This includes allocating memory in the DTV.  But we
>      cannot change any DTV other than our own.  So, if we
>      cannot guarantee that there is room in the DTV we don't
>      even try it and fail the load.
>
>      XXX We could track the minimum DTV slots allocated in
>      all threads.  */
>   if (! RTLD_SINGLE_THREAD_P && imap->l_tls_modid > DTV_SURPLUS)
>     _dl_signal_error (0, "dlopen", NULL, N_("\
> cannot load any more object with static TLS"));
>
> This is a seriously problematic condition for the use case described
> above.  There is no reasonable way a system library can ensure that,
> when it is loaded via dlopen, it gets assigned a module ID not larger
> than DTV_SURPLUS (which currently equals 14).
>
> Specifically, we've had a bug report from a major ISV that one of
> their large applications fails to load a plugin via dlopen with
> the above error message, which turned out to be because:
> - the plugin uses OpenMP and is thus implicitly linked against libgomp
> - the main application does not use libgomp, so it gets loaded at dlopen
> - at this point, some 150 libraries are already in use
> - many of those libraries define (regular!) TLS variables
>
> Therefore, the TLS module ID of the (indirectly loaded) libgomp ends
> up being larger than 14, and the dlopen fails.  It doesn't seem to be
> the case that the ISV is doing anything "wrong" here; the problem is
> caused solely by the interaction of glibc and libgomp.
>
> It seems to me that something ought to be fixed here.  Either the use
> of initial-exec variables simply isn't reliably supportable, but then
> not even system libraries like libgomp should use it.  Or else, glibc
> *wants* to support that use case, but then it should do so in a way
> that reliably works as long as system libraries adhere to conditions
> that are in their power to implement.
>
> Thinking along the latter lines, it seems the dl_open_worker check
> may be overly conservative:
>
>      For static TLS we have to allocate the memory here and
>      now.  This includes allocating memory in the DTV.
>
> It is not obvious to me that this second sentence is actually true.
>
> It *is* true that *given the current implementation*, we would fail
> if the DTV were not allocated.  This is because init_one_static_tls
> (in nptl/allocatestack.c) does:
>
>   /* Fill in the DTV slot so that a later LD/GD access will find it.  */
>   dtv[map->l_tls_modid].pointer.val = dest;
>   dtv[map->l_tls_modid].pointer.is_static = true;
>
> which would simply crash if the DTV were not allocated.
>
> However, I'm not sure why we have to do that at this point.
> Variables accessed via the initial-exec model do not actually use the
> DTV, since the linker resolves the offsets in the static TLS block
> directly as offsets relative to the thread pointer, without using the
> DTV.
>
> Of course, if such a variable were to be *also* accessed via a normal
> general-dynamic (or local-dynamic) access, *then* we'd need the DTV.
> But at this point, the __tls_get_addr routine would get involved,
> which would have the chance to set up the DTV entry on the fly, and
> (re-)allocate DTV space as needed.  It's just that the current
> implementation of __tls_get_addr implicitly assumes it is never
> called for static TLS modules, and would (wrongly) also allocate the
> TLS data area.
>
> If __tls_get_addr were changed to also work on static TLS modules
> (i.e. only allocate the DTV and have it point to the pre-allocated
> static TLS data area in such cases), then we wouldn't have to init
> the DTV in init_one_static_tls, and then we could do without the
> dl_open_worker check.  Does this sound reasonable?
>
> Bye,
> Ulrich
>
> P.S.: Appended is a small test case that shows the issue.  Note that
> just two libraries using TLS suffice to trigger the problem, because
> module IDs are not even reliably re-used after a dlclose ...
>
> Makefile
> ========
>
> all: module1.so module2.so main
>
> clean:
> 	rm -f module.so module1.so module2.so main
>
> module1.so: module.c
> 	gcc -g -Wall -DMODULE=1 -fpic -shared -o module1.so module.c
>
> module2.so: module.c
> 	gcc -g -Wall -DMODULE=2 -fpic -shared -o module2.so module.c
>
> main: main.c
> 	gcc -g -Wall -D_GNU_SOURCE -o main main.c -ldl -lpthread
>
> main.c
> ======
>
> #include <stdio.h>
> #include <dlfcn.h>
> #include <stdlib.h>
> #include <pthread.h>
>
> pthread_t thread_id;
>
> void *thread_start (void *arg)
> {
>   printf ("Thread started\n");
>   for (;;)
>     ;
> }
>
> void run_thread (void)
> {
>   pthread_create(&thread_id, NULL, &thread_start, NULL);
> }
>
> void *test (const char *name)
> {
>   void *handle, *func;
>   size_t modid;
>
>   handle = dlopen (name, RTLD_NOW);
>   if (!handle)
>     {
>       printf ("Cannot open %s\n", name);
>       exit (1);
>     }
>
>   func = dlsym (handle, "func");
>   if (!func)
>     {
>       printf ("Cannot find func\n");
>       exit (1);
>     }
>
>   ((void (*)(void))func)();
>
>   if (dlinfo(handle, RTLD_DI_TLS_MODID, &modid))
>     {
>       printf ("Cannot find TLS module ID\n");
>       exit (1);
>     }
>
>   printf ("Module ID: %ld\n", (long) modid);
>
>   return handle;
> }
>
> int main (void)
> {
>   void *m1, *m2;
>   int i;
>
>   run_thread ();
>
>   m1 = test ("./module1.so");
>   m2 = test ("./module2.so");
>
>   for (i = 0; i < 100; i++)
>     {
>       dlclose (m1);
>       m1 = test ("./module1.so");
>       dlclose (m2);
>       m2 = test ("./module2.so");
>     }
>
>   dlclose (m1);
>   dlclose (m2);
>   return 0;
> }
>
>
> module.c
> ========
>
> #include <stdio.h>
>
> __thread int x __attribute__ ((tls_model ("initial-exec")));
>
> void func (void)
> {
>   printf ("Module %d TLS variable is: %d\n", MODULE, x);
> }
>
>
> --
>   Dr. Ulrich Weigand
>   GNU/Linux compilers and toolchain
>   ulrich.weig...@de.ibm.com
>