On 01/10/20 09:30 +0200, Christophe Lyon via Libstdc++ wrote:
On Wed, 30 Sep 2020 at 22:44, Jonathan Wakely <jwak...@redhat.com> wrote:

On 30/09/20 16:03 +0100, Jonathan Wakely wrote:
>On 29/09/20 13:51 +0200, Christophe Lyon via Libstdc++ wrote:
>>On Sat, 26 Sep 2020 at 21:42, Jonathan Wakely via Gcc-patches
>><gcc-patches@gcc.gnu.org> wrote:
>>>
>>>Glibc 2.32 adds a global variable that says whether the process is
>>>single-threaded. We can use this to decide whether to elide atomic
>>>operations, as a more precise and reliable indicator than
>>>__gthread_active_p.
>>>
>>>This means that guard variables for statics and reference counting in
>>>shared_ptr can use less expensive, non-atomic ops even in processes that
>>>are linked to libpthread, as long as no threads have been created yet.
>>>It also means that we switch to using atomics if libpthread gets loaded
>>>later via dlopen (this still isn't supported in general, for other
>>>reasons).
>>>
>>>We can't use __libc_single_threaded to replace __gthread_active_p
>>>everywhere. If we replaced the uses of __gthread_active_p in std::mutex
>>>then we would elide the pthread_mutex_lock in the code below, but not
>>>the pthread_mutex_unlock:
>>>
>>>  std::mutex m;
>>>  m.lock();            // pthread_mutex_lock
>>>  std::thread t([]{}); // __libc_single_threaded = false
>>>  t.join();
>>>  m.unlock();          // pthread_mutex_unlock
>>>
>>>We need the lock and unlock to use the same "is threading enabled"
>>>predicate, and similarly for init/destroy pairs for mutexes and
>>>condition variables, so that we don't try to release resources that were
>>>never acquired.
>>>
>>>There are other places that could use __libc_single_threaded, such as
>>>_Sp_locker in src/c++11/shared_ptr.cc and locale init functions, but
>>>they can be changed later.
>>>
>>>libstdc++-v3/ChangeLog:
>>>
>>>        PR libstdc++/96817
>>>        * include/ext/atomicity.h (__gnu_cxx::__is_single_threaded()):
>>>        New function wrapping __libc_single_threaded if available.
>>>        (__exchange_and_add_dispatch, __atomic_add_dispatch): Use it.
>>>        * libsupc++/guard.cc (__cxa_guard_acquire, __cxa_guard_abort)
>>>        (__cxa_guard_release): Likewise.
>>>        * testsuite/18_support/96817.cc: New test.
>>>
>>>Tested powerpc64le-linux, with glibc 2.31 and 2.32. Committed to trunk.
>>
>>Hi,
>>
>>This patch introduced regressions on armeb-linux-gnueabhf:
>>--target armeb-none-linux-gnueabihf --with-cpu cortex-a9
>>   g++.dg/compat/init/init-ref2 cp_compat_x_tst.o-cp_compat_y_tst.o execute
>>   g++.dg/cpp2a/decomp1.C  -std=gnu++14 execution test
>>   g++.dg/cpp2a/decomp1.C  -std=gnu++17 execution test
>>   g++.dg/cpp2a/decomp1.C  -std=gnu++2a execution test
>>   g++.dg/init/init-ref2.C  -std=c++14 execution test
>>   g++.dg/init/init-ref2.C  -std=c++17 execution test
>>   g++.dg/init/init-ref2.C  -std=c++2a execution test
>>   g++.dg/init/init-ref2.C  -std=c++98 execution test
>>   g++.dg/init/ref15.C  -std=c++14 execution test
>>   g++.dg/init/ref15.C  -std=c++17 execution test
>>   g++.dg/init/ref15.C  -std=c++2a execution test
>>   g++.dg/init/ref15.C  -std=c++98 execution test
>>   g++.old-deja/g++.jason/pmf7.C  -std=c++98 execution test
>>   g++.old-deja/g++.mike/leak1.C  -std=c++14 execution test
>>   g++.old-deja/g++.mike/leak1.C  -std=c++17 execution test
>>   g++.old-deja/g++.mike/leak1.C  -std=c++2a execution test
>>   g++.old-deja/g++.mike/leak1.C  -std=c++98 execution test
>>   g++.old-deja/g++.other/init19.C  -std=c++14 execution test
>>   g++.old-deja/g++.other/init19.C  -std=c++17 execution test
>>   g++.old-deja/g++.other/init19.C  -std=c++2a execution test
>>   g++.old-deja/g++.other/init19.C  -std=c++98 execution test
>>
>>and probably some (280) in libstdc++ tests: (I didn't bisect those):
>>   19_diagnostics/error_category/generic_category.cc execution test
>>   19_diagnostics/error_category/system_category.cc execution test
>>   20_util/scoped_allocator/1.cc execution test
>>   20_util/scoped_allocator/2.cc execution test
>>   20_util/scoped_allocator/construct_pair_c++2a.cc execution test
>>   20_util/to_address/debug.cc execution test
>>   20_util/variant/run.cc execution test
>
>I think this is a latent bug in the static initialization code for
>EABI that affects big endian. In libstdc++-v3/libsupc++/guard.cc we
>have:
>
># ifndef _GLIBCXX_GUARD_TEST_AND_ACQUIRE
>
>// Test the guard variable with a memory load with
>// acquire semantics.
>
>inline bool
>__test_and_acquire (__cxxabiv1::__guard *g)
>{
>  unsigned char __c;
>  unsigned char *__p = reinterpret_cast<unsigned char *>(g);
>  __atomic_load (__p, &__c,  __ATOMIC_ACQUIRE);
>  (void) __p;
>  return _GLIBCXX_GUARD_TEST(&__c);
>}
>#  define _GLIBCXX_GUARD_TEST_AND_ACQUIRE(G) __test_and_acquire (G)
># endif
>
>That inspects the first byte of the guard variable. But for EABI the
>"is initialized" bit is the least significant bit of the guard
>variable. For little endian that's fine, the least significant bit is
>in the first byte. But for big endian, it's not in the first byte, so
>we are looking in the wrong place. This means that the initial check
>in __cxa_guard_acquire is wrong:
>
>  extern "C"
>  int __cxa_guard_acquire (__guard *g)
>  {
>#ifdef __GTHREADS
>    // If the target can reorder loads, we need to insert a read memory
>    // barrier so that accesses to the guarded variable happen after the
>    // guard test.
>    if (_GLIBCXX_GUARD_TEST_AND_ACQUIRE (g))
>      return 0;
>
>This will always be false for big endian EABI, which means we run the
>rest of the function even when the variable is initialized. Previously
>that still gave the right answer, but inefficiently. After my change
>it gives the wrong answer, because the new code assumes that the check
>right at the start of __cxa_guard_acquire ACTUALLY WORKS. Silly me.
>
>I think the attached should fix it. This should be backported to fix
>inefficient code for static init on big endian EABI targets.


Committed as r11-3572, but I haven't been able to test it on a big
endian EABI system, only little endian EABI.

Hi Jonathan,

unfortunately, this breaks the build even for little-endian for me:
9980570_1.tmpdir/aci-gcc-fsf/builds/gcc-fsf-gccsrc/obj-arm-none-linux-gnueabi/gcc3/arm-none-linux-gnueabi/libstdc++-v3/include
-I/tmp/9980570_1.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/libstdc++-v3/libsupc++
-D_GLIBCXX_SHARED -fno-implicit-templates -Wall -Wextra
-Wwrite-strings -Wcast-qual -Wabi=2 -fdiagnostics-show-location=once
-ffunction-sections -fdata-sections -frandom-seed=guard.lo -g -O2
-D_GNU_SOURCE -c
/tmp/9980570_1.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/libstdc++-v3/libsupc++/guard.cc
-fPIC -DPIC -D_GLIBCXX_SHARED -o guard.o
In file included from
/tmp/9980570_1.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/libstdc++-v3/libsupc++/cxxabi.h:50,
                from
/tmp/9980570_1.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/libstdc++-v3/libsupc++/guard.cc:28:
/tmp/9980570_1.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/libstdc++-v3/libsupc++/guard.cc:
In function 'int
__cxxabiv1::__cxa_guard_acquire(__cxxabiv1::__guard*)':
/tmp/9980570_1.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/libstdc++-v3/libsupc++/guard.cc:249:9:
error: invalid type argument of unary '*' (have 'int')
 249 |     if (_GLIBCXX_GUARD_TEST_AND_ACQUIRE (g))
     |         ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
make[4]: *** [Makefile:763: guard.lo] Error 1

How did you test it? Maybe I'm missing something?

Apparently I didn't do a clean build and the guard.cc file didn't get
recompiled with the new header.

The attached works on armv7l-unknown-linux-gnueabihf, testing it now.


commit 30c59cc6ce174a948d2968ce8d4fef6b55af1bbc
Author: Jonathan Wakely <jwak...@redhat.com>
Date:   Thu Oct 1 08:45:02 2020

    libstdc++: Fix test_and_acquire for EABI
    
    libstdc++-v3/ChangeLog:
    
            * config/cpu/arm/cxxabi_tweaks.h (_GLIBCXX_GUARD_TEST_AND_ACQUIRE):
            Do not try to dereference return value of __atomic_load_n.

diff --git a/libstdc++-v3/config/cpu/arm/cxxabi_tweaks.h b/libstdc++-v3/config/cpu/arm/cxxabi_tweaks.h
index 4fb34869f8a..a08afed7d21 100644
--- a/libstdc++-v3/config/cpu/arm/cxxabi_tweaks.h
+++ b/libstdc++-v3/config/cpu/arm/cxxabi_tweaks.h
@@ -48,7 +48,7 @@ namespace __cxxabiv1
   typedef int __guard;
 
 #define _GLIBCXX_GUARD_TEST_AND_ACQUIRE(x) \
-  _GLIBCXX_GUARD_TEST(__atomic_load_n(x, __ATOMIC_ACQUIRE))
+  ((__atomic_load_n(x, __ATOMIC_ACQUIRE) & 1) != 0)
 #define _GLIBCXX_GUARD_SET_AND_RELEASE(x) \
   __atomic_store_n(x, 1, __ATOMIC_RELEASE)
 

Reply via email to