On 01/10/20 08:50 +0100, Jonathan Wakely wrote:
On 01/10/20 09:30 +0200, Christophe Lyon via Libstdc++ wrote:
On Wed, 30 Sep 2020 at 22:44, Jonathan Wakely <jwak...@redhat.com> wrote:
On 30/09/20 16:03 +0100, Jonathan Wakely wrote:
On 29/09/20 13:51 +0200, Christophe Lyon via Libstdc++ wrote:
On Sat, 26 Sep 2020 at 21:42, Jonathan Wakely via Gcc-patches
<gcc-patches@gcc.gnu.org> wrote:
Glibc 2.32 adds a global variable that says whether the process is
single-threaded. We can use this to decide whether to elide atomic
operations, as a more precise and reliable indicator than
__gthread_active_p.
This means that guard variables for statics and reference counting in
shared_ptr can use less expensive, non-atomic ops even in processes that
are linked to libpthread, as long as no threads have been created yet.
It also means that we switch to using atomics if libpthread gets loaded
later via dlopen (this still isn't supported in general, for other
reasons).
We can't use __libc_single_threaded to replace __gthread_active_p
everywhere. If we replaced the uses of __gthread_active_p in std::mutex
then we would elide the pthread_mutex_lock in the code below, but not
the pthread_mutex_unlock:
std::mutex m;
m.lock(); // pthread_mutex_lock
std::thread t([]{}); // __libc_single_threaded = false
t.join();
m.unlock(); // pthread_mutex_unlock
We need the lock and unlock to use the same "is threading enabled"
predicate, and similarly for init/destroy pairs for mutexes and
condition variables, so that we don't try to release resources that were
never acquired.
There are other places that could use __libc_single_threaded, such as
_Sp_locker in src/c++11/shared_ptr.cc and locale init functions, but
they can be changed later.
libstdc++-v3/ChangeLog:
PR libstdc++/96817
* include/ext/atomicity.h (__gnu_cxx::__is_single_threaded()):
New function wrapping __libc_single_threaded if available.
(__exchange_and_add_dispatch, __atomic_add_dispatch): Use it.
* libsupc++/guard.cc (__cxa_guard_acquire, __cxa_guard_abort)
(__cxa_guard_release): Likewise.
* testsuite/18_support/96817.cc: New test.
Tested powerpc64le-linux, with glibc 2.31 and 2.32. Committed to trunk.
Hi,
This patch introduced regressions on armeb-linux-gnueabhf:
--target armeb-none-linux-gnueabihf --with-cpu cortex-a9
g++.dg/compat/init/init-ref2 cp_compat_x_tst.o-cp_compat_y_tst.o execute
g++.dg/cpp2a/decomp1.C -std=gnu++14 execution test
g++.dg/cpp2a/decomp1.C -std=gnu++17 execution test
g++.dg/cpp2a/decomp1.C -std=gnu++2a execution test
g++.dg/init/init-ref2.C -std=c++14 execution test
g++.dg/init/init-ref2.C -std=c++17 execution test
g++.dg/init/init-ref2.C -std=c++2a execution test
g++.dg/init/init-ref2.C -std=c++98 execution test
g++.dg/init/ref15.C -std=c++14 execution test
g++.dg/init/ref15.C -std=c++17 execution test
g++.dg/init/ref15.C -std=c++2a execution test
g++.dg/init/ref15.C -std=c++98 execution test
g++.old-deja/g++.jason/pmf7.C -std=c++98 execution test
g++.old-deja/g++.mike/leak1.C -std=c++14 execution test
g++.old-deja/g++.mike/leak1.C -std=c++17 execution test
g++.old-deja/g++.mike/leak1.C -std=c++2a execution test
g++.old-deja/g++.mike/leak1.C -std=c++98 execution test
g++.old-deja/g++.other/init19.C -std=c++14 execution test
g++.old-deja/g++.other/init19.C -std=c++17 execution test
g++.old-deja/g++.other/init19.C -std=c++2a execution test
g++.old-deja/g++.other/init19.C -std=c++98 execution test
and probably some (280) in libstdc++ tests: (I didn't bisect those):
19_diagnostics/error_category/generic_category.cc execution test
19_diagnostics/error_category/system_category.cc execution test
20_util/scoped_allocator/1.cc execution test
20_util/scoped_allocator/2.cc execution test
20_util/scoped_allocator/construct_pair_c++2a.cc execution test
20_util/to_address/debug.cc execution test
20_util/variant/run.cc execution test
I think this is a latent bug in the static initialization code for
EABI that affects big endian. In libstdc++-v3/libsupc++/guard.cc we
have:
# ifndef _GLIBCXX_GUARD_TEST_AND_ACQUIRE
// Test the guard variable with a memory load with
// acquire semantics.
inline bool
__test_and_acquire (__cxxabiv1::__guard *g)
{
unsigned char __c;
unsigned char *__p = reinterpret_cast<unsigned char *>(g);
__atomic_load (__p, &__c, __ATOMIC_ACQUIRE);
(void) __p;
return _GLIBCXX_GUARD_TEST(&__c);
}
# define _GLIBCXX_GUARD_TEST_AND_ACQUIRE(G) __test_and_acquire (G)
# endif
That inspects the first byte of the guard variable. But for EABI the
"is initialized" bit is the least significant bit of the guard
variable. For little endian that's fine, the least significant bit is
in the first byte. But for big endian, it's not in the first byte, so
we are looking in the wrong place. This means that the initial check
in __cxa_guard_acquire is wrong:
extern "C"
int __cxa_guard_acquire (__guard *g)
{
#ifdef __GTHREADS
// If the target can reorder loads, we need to insert a read memory
// barrier so that accesses to the guarded variable happen after the
// guard test.
if (_GLIBCXX_GUARD_TEST_AND_ACQUIRE (g))
return 0;
This will always be false for big endian EABI, which means we run the
rest of the function even when the variable is initialized. Previously
that still gave the right answer, but inefficiently. After my change
it gives the wrong answer, because the new code assumes that the check
right at the start of __cxa_guard_acquire ACTUALLY WORKS. Silly me.
I think the attached should fix it. This should be backported to fix
inefficient code for static init on big endian EABI targets.
Committed as r11-3572, but I haven't been able to test it on a big
endian EABI system, only little endian EABI.
Hi Jonathan,
unfortunately, this breaks the build even for little-endian for me:
9980570_1.tmpdir/aci-gcc-fsf/builds/gcc-fsf-gccsrc/obj-arm-none-linux-gnueabi/gcc3/arm-none-linux-gnueabi/libstdc++-v3/include
-I/tmp/9980570_1.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/libstdc++-v3/libsupc++
-D_GLIBCXX_SHARED -fno-implicit-templates -Wall -Wextra
-Wwrite-strings -Wcast-qual -Wabi=2 -fdiagnostics-show-location=once
-ffunction-sections -fdata-sections -frandom-seed=guard.lo -g -O2
-D_GNU_SOURCE -c
/tmp/9980570_1.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/libstdc++-v3/libsupc++/guard.cc
-fPIC -DPIC -D_GLIBCXX_SHARED -o guard.o
In file included from
/tmp/9980570_1.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/libstdc++-v3/libsupc++/cxxabi.h:50,
from
/tmp/9980570_1.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/libstdc++-v3/libsupc++/guard.cc:28:
/tmp/9980570_1.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/libstdc++-v3/libsupc++/guard.cc:
In function 'int
__cxxabiv1::__cxa_guard_acquire(__cxxabiv1::__guard*)':
/tmp/9980570_1.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/libstdc++-v3/libsupc++/guard.cc:249:9:
error: invalid type argument of unary '*' (have 'int')
249 | if (_GLIBCXX_GUARD_TEST_AND_ACQUIRE (g))
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
make[4]: *** [Makefile:763: guard.lo] Error 1
How did you test it? Maybe I'm missing something?
Apparently I didn't do a clean build and the guard.cc file didn't get
recompiled with the new header.
The attached works on armv7l-unknown-linux-gnueabihf, testing it now.
My testing was only partially successful because the machine I'm
testing on keeps locking up and needing a reboot half way through the
testsuite. But at least it compiles now, and I didn't see any
failures. Only tested on little endian EABI though.
Pushed to master as r11-3585.
commit 30c59cc6ce174a948d2968ce8d4fef6b55af1bbc
Author: Jonathan Wakely <jwak...@redhat.com>
Date: Thu Oct 1 08:45:02 2020
libstdc++: Fix test_and_acquire for EABI
libstdc++-v3/ChangeLog:
* config/cpu/arm/cxxabi_tweaks.h (_GLIBCXX_GUARD_TEST_AND_ACQUIRE):
Do not try to dereference return value of __atomic_load_n.
diff --git a/libstdc++-v3/config/cpu/arm/cxxabi_tweaks.h
b/libstdc++-v3/config/cpu/arm/cxxabi_tweaks.h
index 4fb34869f8a..a08afed7d21 100644
--- a/libstdc++-v3/config/cpu/arm/cxxabi_tweaks.h
+++ b/libstdc++-v3/config/cpu/arm/cxxabi_tweaks.h
@@ -48,7 +48,7 @@ namespace __cxxabiv1
typedef int __guard;
#define _GLIBCXX_GUARD_TEST_AND_ACQUIRE(x) \
- _GLIBCXX_GUARD_TEST(__atomic_load_n(x, __ATOMIC_ACQUIRE))
+ ((__atomic_load_n(x, __ATOMIC_ACQUIRE) & 1) != 0)
#define _GLIBCXX_GUARD_SET_AND_RELEASE(x) \
__atomic_store_n(x, 1, __ATOMIC_RELEASE)