Hi Johannes,

> I'd recommend not using such a workaround:
>
> This means getTLSRange will always return an empty range, but the GC uses
> this to scan TLS memory. This means a GC collection can delete objects
> which are still pointed to from TLS. This leads to hard to debug errors,
> and if I remember correctly, the testsuite will not catch these errors. I
> think we have code in phobos though which references objects only from TLS
> and this will break after a GC collection.

I fully admit to having been wary of such an approach myself, but was
astonished how far it seemed to get me. I suspect the two testsuite
regressions I mentioned (compared to a build with dlpi_tls_modid present)
are exactly of the kind you describe: e.g. the gdc.test/runnable/testaa.d
failures look like this

core.exception.rangeer...@gdc.test/runnable/testaa.d(410): Range violation
----------------
/vol/gcc/src/hg/trunk/local/libphobos/libdruntime/core/exception.d:496 onRangeError [0x80f0d2c]
/vol/gcc/src/hg/trunk/local/libphobos/libdruntime/core/exception.d:672 _d_arraybounds [0x80f132f]
??:? void testaa.test15() [0x80d7ae4]
??:? _Dmain [0x80dd3fc]
before test 1

and gdc.test/runnable/xtest55.d fails like so:

core.exception.asserter...@gdc.test/runnable/xtest55.d(19): Assertion failure
----------------
/vol/gcc/src/hg/trunk/local/libphobos/libdruntime/core/exception.d:441 onAssertError [0x7fff55dd3b56]
??:? _Dmain [0x418959]
7FFFBEB00000 7FFFBEB00000

It's a small set admittedly (but there are the libphobos failures as
well), but a compiler that leaves its users with a feeling of
unreliability is probably worse than none at all.

Just for the record, I saw the same regressions on Linux/x86_64 when I
accidentally didn't define _GNU_SOURCE in the configure test for
dlpi_tls_modid, producing an equivalent configuration. So this isn't
Solaris-specific in any way.

> I'm not sure what's a good solution here. EmuTLS has got the same problem,
> but I'll post a RFC patch next weekend which would allow to scan the emuTLS
> memory. If we somehow make that work, I'd recommend to use emuTLS instead
> of native TLS if there's no way to scan the native TLS*.

The problem here is that we'd probably need to build gcc twice in this
case: once with native TLS for all non-D languages, and a second time
with --disable-tls for D. AFAICS TARGET_HAVE_TLS needs to be a
compile-time constant and cannot depend on the language being compiled.

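To make concrete what is lost when dlpi_tls_modid is unavailable, here is a
minimal C sketch of the lookup a getTLSRange-style function relies on. It is
an illustration only, not the druntime code: it assumes glibc on Linux, where
the first object visited by dl_iterate_phdr is the main program, and the
names tls_range, tls_marker and main_tls_range are hypothetical. The
_GNU_SOURCE define at the top is there because, as far as I can tell, glibc's
<link.h> only exposes dl_iterate_phdr and the dlpi_tls_* fields when it is set.

```c
#define _GNU_SOURCE            /* assumed necessary for glibc's <link.h> */
#include <assert.h>
#include <link.h>
#include <stddef.h>

/* Hypothetical result type: one thread's TLS block for one module. */
struct tls_range {
    void  *start;   /* dlpi_tls_data: block address for the calling thread */
    size_t size;    /* p_memsz of the PT_TLS segment */
    size_t modid;   /* dlpi_tls_modid: TLS module ID */
};

/* Hypothetical marker so that the main program has a PT_TLS segment. */
__thread int tls_marker = 42;

/* Callback: inspect only the first object (assumed: the main program). */
static int first_object_cb(struct dl_phdr_info *info, size_t size, void *data)
{
    struct tls_range *out = data;
    size_t need = offsetof(struct dl_phdr_info, dlpi_tls_data)
                  + sizeof(info->dlpi_tls_data);

    /* Older libcs may pass a shorter struct without the TLS fields. */
    if (size >= need) {
        for (int i = 0; i < info->dlpi_phnum; i++) {
            if (info->dlpi_phdr[i].p_type == PT_TLS) {
                out->start = info->dlpi_tls_data;
                out->size  = info->dlpi_phdr[i].p_memsz;
                out->modid = info->dlpi_tls_modid;
                break;
            }
        }
    }
    return 1;   /* non-zero stops the iteration after the first object */
}

/* Returns 1 and fills *out if a TLS block was found, 0 otherwise. */
int main_tls_range(struct tls_range *out)
{
    assert(out != NULL);
    out->start = NULL;
    out->size  = 0;
    out->modid = 0;
    dl_iterate_phdr(first_object_cb, out);
    return out->start != NULL;
}
```

A collector would treat [start, start + size) as a root range for the
current thread. If the lookup yields nothing, which is what the empty-range
workaround amounts to, a pointer stored only in something like tls_marker's
segment is never seen by the scan.
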
> FYI Martin Nowak(in CC) wrote most of the original code for rt.sections so
> he's the expert we'd have to ask.
>
> * Maybe we could implement a more runtime-independent approach to scan
>   native TLS?
>   1) We somehow need to bracket the TLS section (it would have to be
>      per-shared-library though, we basically need thread-local, hidden
>      __start_tls and __stop_tls symbols).
>   2) We need to emit a hidden _dso_scan_tls function into each D library.
>      A pointer to this DSO specific function then has to be passed in
>      CompilerDSOData to _d_dso_registry.
>   3) tlsRange has to forward to the correct, DSO specific _dso_scan_tls.
>
> 2 and 3 are easy but I'm not sure if we can do 1.

Right: I suspect 1 would be way more difficult than the
__start_minfo/__stop_minfo stuff.

I failed to mention another approach in my patch submission, though I
alluded to it in PR d/88150: the ldc fork of libdruntime

	https://github.com/ldc-developers/druntime

has in src/rt/sections_ldc.d an implementation of getTLSRange for
Illumos/Solaris without dlpi_tls_modid. I managed to adapt it to
sections_elf_shared.d. However, it uses undocumented libc internals
(which probably don't change between Solaris 10 and 11.4, so that
shouldn't be too bad), and more importantly it only gets you the TLS
range for the main executable, so it isn't very useful AFAICS.

	Rainer

-- 
-----------------------------------------------------------------------------
Rainer Orth, Center for Biotechnology, Bielefeld University