On Wed, 30 Sep 2020 at 22:44, Jonathan Wakely <jwak...@redhat.com> wrote: > > On 30/09/20 16:03 +0100, Jonathan Wakely wrote: > >On 29/09/20 13:51 +0200, Christophe Lyon via Libstdc++ wrote: > >>On Sat, 26 Sep 2020 at 21:42, Jonathan Wakely via Gcc-patches > >><gcc-patches@gcc.gnu.org> wrote: > >>> > >>>Glibc 2.32 adds a global variable that says whether the process is > >>>single-threaded. We can use this to decide whether to elide atomic > >>>operations, as a more precise and reliable indicator than > >>>__gthread_active_p. > >>> > >>>This means that guard variables for statics and reference counting in > >>>shared_ptr can use less expensive, non-atomic ops even in processes that > >>>are linked to libpthread, as long as no threads have been created yet. > >>>It also means that we switch to using atomics if libpthread gets loaded > >>>later via dlopen (this still isn't supported in general, for other > >>>reasons). > >>> > >>>We can't use __libc_single_threaded to replace __gthread_active_p > >>>everywhere. If we replaced the uses of __gthread_active_p in std::mutex > >>>then we would elide the pthread_mutex_lock in the code below, but not > >>>the pthread_mutex_unlock: > >>> > >>> std::mutex m; > >>> m.lock(); // pthread_mutex_lock > >>> std::thread t([]{}); // __libc_single_threaded = false > >>> t.join(); > >>> m.unlock(); // pthread_mutex_unlock > >>> > >>>We need the lock and unlock to use the same "is threading enabled" > >>>predicate, and similarly for init/destroy pairs for mutexes and > >>>condition variables, so that we don't try to release resources that were > >>>never acquired. > >>> > >>>There are other places that could use __libc_single_threaded, such as > >>>_Sp_locker in src/c++11/shared_ptr.cc and locale init functions, but > >>>they can be changed later. > >>> > >>>libstdc++-v3/ChangeLog: > >>> > >>> PR libstdc++/96817 > >>> * include/ext/atomicity.h (__gnu_cxx::__is_single_threaded()): > >>> New function wrapping __libc_single_threaded if available. > >>> (__exchange_and_add_dispatch, __atomic_add_dispatch): Use it. > >>> * libsupc++/guard.cc (__cxa_guard_acquire, __cxa_guard_abort) > >>> (__cxa_guard_release): Likewise. > >>> * testsuite/18_support/96817.cc: New test. > >>> > >>>Tested powerpc64le-linux, with glibc 2.31 and 2.32. Committed to trunk. > >> > >>Hi, > >> > >>This patch introduced regressions on armeb-linux-gnueabhf: > >>--target armeb-none-linux-gnueabihf --with-cpu cortex-a9 > >> g++.dg/compat/init/init-ref2 cp_compat_x_tst.o-cp_compat_y_tst.o execute > >> g++.dg/cpp2a/decomp1.C -std=gnu++14 execution test > >> g++.dg/cpp2a/decomp1.C -std=gnu++17 execution test > >> g++.dg/cpp2a/decomp1.C -std=gnu++2a execution test > >> g++.dg/init/init-ref2.C -std=c++14 execution test > >> g++.dg/init/init-ref2.C -std=c++17 execution test > >> g++.dg/init/init-ref2.C -std=c++2a execution test > >> g++.dg/init/init-ref2.C -std=c++98 execution test > >> g++.dg/init/ref15.C -std=c++14 execution test > >> g++.dg/init/ref15.C -std=c++17 execution test > >> g++.dg/init/ref15.C -std=c++2a execution test > >> g++.dg/init/ref15.C -std=c++98 execution test > >> g++.old-deja/g++.jason/pmf7.C -std=c++98 execution test > >> g++.old-deja/g++.mike/leak1.C -std=c++14 execution test > >> g++.old-deja/g++.mike/leak1.C -std=c++17 execution test > >> g++.old-deja/g++.mike/leak1.C -std=c++2a execution test > >> g++.old-deja/g++.mike/leak1.C -std=c++98 execution test > >> g++.old-deja/g++.other/init19.C -std=c++14 execution test > >> g++.old-deja/g++.other/init19.C -std=c++17 execution test > >> g++.old-deja/g++.other/init19.C -std=c++2a execution test > >> g++.old-deja/g++.other/init19.C -std=c++98 execution test > >> > >>and probably some (280) in libstdc++ tests: (I didn't bisect those): > >> 19_diagnostics/error_category/generic_category.cc execution test > >> 19_diagnostics/error_category/system_category.cc execution test > >> 20_util/scoped_allocator/1.cc execution test > >> 20_util/scoped_allocator/2.cc execution test > >> 20_util/scoped_allocator/construct_pair_c++2a.cc execution test > >> 20_util/to_address/debug.cc execution test > >> 20_util/variant/run.cc execution test > > > >I think this is a latent bug in the static initialization code for > >EABI that affects big endian. In libstdc++-v3/libsupc++/guard.cc we > >have: > > > ># ifndef _GLIBCXX_GUARD_TEST_AND_ACQUIRE > > > >// Test the guard variable with a memory load with > >// acquire semantics. > > > >inline bool > >__test_and_acquire (__cxxabiv1::__guard *g) > >{ > > unsigned char __c; > > unsigned char *__p = reinterpret_cast<unsigned char *>(g); > > __atomic_load (__p, &__c, __ATOMIC_ACQUIRE); > > (void) __p; > > return _GLIBCXX_GUARD_TEST(&__c); > >} > ># define _GLIBCXX_GUARD_TEST_AND_ACQUIRE(G) __test_and_acquire (G) > ># endif > > > >That inspects the first byte of the guard variable. But for EABI the > >"is initialized" bit is the least significant bit of the guard > >variable. For little endian that's fine, the least significant bit is > >in the first byte. But for big endian, it's not in the first byte, so > >we are looking in the wrong place. This means that the initial check > >in __cxa_guard_acquire is wrong: > > > > extern "C" > > int __cxa_guard_acquire (__guard *g) > > { > >#ifdef __GTHREADS > > // If the target can reorder loads, we need to insert a read memory > > // barrier so that accesses to the guarded variable happen after the > > // guard test. > > if (_GLIBCXX_GUARD_TEST_AND_ACQUIRE (g)) > > return 0; > > > >This will always be false for big endian EABI, which means we run the > >rest of the function even when the variable is initialized. Previously > >that still gave the right answer, but inefficiently. After my change > >it gives the wrong answer, because the new code assumes that the check > >right at the start of __cxa_guard_acquire ACTUALLY WORKS. Silly me. > > > >I think the attached should fix it. This should be backported to fix > >inefficient code for static init on big endian EABI targets. > > > Committed as r11-3572, but I haven't been able to test it on a big > endian EABI system, only little endian EABI. > Hi Jonathan,
unfortunately, this breaks the build even for little-endian for me: 9980570_1.tmpdir/aci-gcc-fsf/builds/gcc-fsf-gccsrc/obj-arm-none-linux-gnueabi/gcc3/arm-none-linux-gnueabi/libstdc++-v3/include -I/tmp/9980570_1.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/libstdc++-v3/libsupc++ -D_GLIBCXX_SHARED -fno-implicit-templates -Wall -Wextra -Wwrite-strings -Wcast-qual -Wabi=2 -fdiagnostics-show-location=once -ffunction-sections -fdata-sections -frandom-seed=guard.lo -g -O2 -D_GNU_SOURCE -c /tmp/9980570_1.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/libstdc++-v3/libsupc++/guard.cc -fPIC -DPIC -D_GLIBCXX_SHARED -o guard.o In file included from /tmp/9980570_1.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/libstdc++-v3/libsupc++/cxxabi.h:50, from /tmp/9980570_1.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/libstdc++-v3/libsupc++/guard.cc:28: /tmp/9980570_1.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/libstdc++-v3/libsupc++/guard.cc: In function 'int __cxxabiv1::__cxa_guard_acquire(__cxxabiv1::__guard*)': /tmp/9980570_1.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/libstdc++-v3/libsupc++/guard.cc:249:9: error: invalid type argument of unary '*' (have 'int') 249 | if (_GLIBCXX_GUARD_TEST_AND_ACQUIRE (g)) | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ make[4]: *** [Makefile:763: guard.lo] Error 1 How did you test it? Maybe I'm missing something? Thanks, Christophe