Hi!

On 2025-03-14T11:39:20+0100, I wrote:
> As the first of a few patches to enable libstdc++ for GCN, nvptx targets,
> [...]
> some more fine-tuning is to follow later on.)

Any comments before I push the attached "GCN, nvptx libstdc++: Force use of
'__atomic' builtins [PR119645]"?

Jonathan, please keep a sharp eye on the
'libstdc++-v3/acinclude.m4:GLIBCXX_ENABLE_LOCK_POLICY' change, to make sure
that it affects only GCN and nvptx, and nothing else.


Regards,
 Thomas

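For readers less familiar with the two 'ATOMICITY_SRCDIR' flavors that the commit log mentions, here is a minimal host-side sketch of the difference. It is not the libstdc++ source: 'atomic_word', 'exchange_and_add_builtin', 'exchange_and_add_mutex', 'get_fallback_mutex' and 'demo' are hypothetical names, and 'std::mutex' merely stands in for the libgcc gthreads-based mutex. The point it tries to show is that the mutex flavor needs a function-local static (and thus a guard variable), whereas the builtin flavor needs no lock object at all.

```cpp
// Illustrative sketch only; all names are hypothetical, not libstdc++ code.
#include <cassert>
#include <mutex>

typedef int atomic_word;  // stand-in for libstdc++'s _Atomic_word

// "atomic" flavor: one GCC '__atomic' builtin, no lock object required.
inline atomic_word
exchange_and_add_builtin(volatile atomic_word* mem, int val)
{
  return __atomic_fetch_add(mem, val, __ATOMIC_ACQ_REL);
}

// "mutex" flavor: the lock lives in a function-local static, whose
// thread-safe initialization needs a guard variable behind the scenes.
inline std::mutex&
get_fallback_mutex()
{
  static std::mutex m;
  return m;
}

inline atomic_word
exchange_and_add_mutex(volatile atomic_word* mem, int val)
{
  std::lock_guard<std::mutex> sentry(get_fallback_mutex());
  atomic_word old = *mem;
  *mem = old + val;
  return old;
}

// Both flavors return the previous value and leave the sum behind.
inline atomic_word
demo()
{
  volatile atomic_word refcount = 1;
  atomic_word before = exchange_and_add_builtin(&refcount, 1);
  assert(before == 1);
  before = exchange_and_add_mutex(&refcount, -1);
  assert(before == 2);
  return refcount;
}
```

Both functions have the same observable behavior on a host; the sketch only illustrates why the builtin flavor avoids the dependency on mutex support.
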
From 1d3278050f9560666f6debcd2ead711660bebd4e Mon Sep 17 00:00:00 2001
From: Thomas Schwinge <tschwi...@baylibre.com>
Date: Sat, 5 Apr 2025 23:11:23 +0200
Subject: [PATCH] GCN, nvptx libstdc++: Force use of '__atomic' builtins [PR119645]

For both GCN, nvptx, this gets rid of 'configure'-time:

    configure: WARNING: No native atomic operations are provided for this platform.
    configure: WARNING: They will be faked using a mutex.
    configure: WARNING: Performance of certain classes will degrade as a result.

..., and changes:

    -checking for lock policy for shared_ptr reference counts... mutex
    +checking for lock policy for shared_ptr reference counts... atomic

That means, '[...]/[target]/libstdc++-v3/', 'Makefile's change:

    -ATOMICITY_SRCDIR = config/cpu/generic/atomicity_mutex
    +ATOMICITY_SRCDIR = config/cpu/generic/atomicity_builtins

..., and '[...]/[target]/libstdc++-v3/config.h' changes:

     /* Defined if shared_ptr reference counting should use atomic operations. */
    -/* #undef HAVE_ATOMIC_LOCK_POLICY */
    +#define HAVE_ATOMIC_LOCK_POLICY 1

     /* Define if the compiler supports C++11 atomics. */
    -/* #undef _GLIBCXX_ATOMIC_BUILTINS */
    +#define _GLIBCXX_ATOMIC_BUILTINS 1

..., and '[...]/[target]/libstdc++-v3/include/[target]/bits/c++config.h'
changes:

     /* Defined if shared_ptr reference counting should use atomic operations. */
    -/* #undef _GLIBCXX_HAVE_ATOMIC_LOCK_POLICY */
    +#define _GLIBCXX_HAVE_ATOMIC_LOCK_POLICY 1

     /* Define if the compiler supports C++11 atomics. */
    -/* #undef _GLIBCXX_ATOMIC_BUILTINS */
    +#define _GLIBCXX_ATOMIC_BUILTINS 1

This means that '[...]/[target]/libstdc++-v3/libsupc++/atomicity.cc',
'[...]/[target]/libstdc++-v3/libsupc++/atomicity.o' then uses atomic
instructions for synchronization instead of C++ static local variables, which
in turn for their guard variables, via 'libstdc++-v3/libsupc++/guard.cc', used
'libgcc/gthr.h' recursive mutexes, which currently are unsupported for GCN.

For GCN, this turns ~500 libstdc++ execution test FAILs into PASSes, and also
progresses:

    PASS: g++.dg/tree-ssa/pr20458.C -std=gnu++17 (test for excess errors)
    [-FAIL:-]{+PASS:+} g++.dg/tree-ssa/pr20458.C -std=gnu++17 execution test
    PASS: g++.dg/tree-ssa/pr20458.C -std=gnu++26 (test for excess errors)
    [-FAIL:-]{+PASS:+} g++.dg/tree-ssa/pr20458.C -std=gnu++26 execution test
    UNSUPPORTED: g++.dg/tree-ssa/pr20458.C -std=gnu++98: exception handling not supported

(For nvptx, there is no effective change, due to other misconfiguration.)

	PR target/119645
	libstdc++-v3/
	* acinclude.m4 (GLIBCXX_ENABLE_LOCK_POLICY) [GCN, nvptx]: Hard-code
	results.
	* configure: Regenerate.
	* configure.host [GCN, nvptx] (atomicity_dir): Set to
	'cpu/generic/atomicity_builtins'.
---
 libstdc++-v3/acinclude.m4   |  7 ++++---
 libstdc++-v3/configure      | 11 ++++++-----
 libstdc++-v3/configure.host | 11 +++++++++++
 3 files changed, 21 insertions(+), 8 deletions(-)

diff --git a/libstdc++-v3/acinclude.m4 b/libstdc++-v3/acinclude.m4
index 912e85a46679..5b9f0c5ee89d 100644
--- a/libstdc++-v3/acinclude.m4
+++ b/libstdc++-v3/acinclude.m4
@@ -4007,10 +4007,11 @@ AC_DEFUN([GLIBCXX_ENABLE_LOCK_POLICY], [
     dnl Why don't we check 8-byte CAS for sparc64, where _Atomic_word is long?!
     dnl New targets should only check for CAS for the _Atomic_word type.
     AC_TRY_COMPILE([
-    #if defined __riscv
+    #if defined __AMDGCN__ || defined __nvptx__
+    /* Yes, please. */
+    #elif defined __riscv
     # error "Defaulting to mutex-based locks for ABI compatibility"
-    #endif
-    #if ! defined __GCC_HAVE_SYNC_COMPARE_AND_SWAP_2
+    #elif ! defined __GCC_HAVE_SYNC_COMPARE_AND_SWAP_2
     # error "No 2-byte compare-and-swap"
     #elif ! defined __GCC_HAVE_SYNC_COMPARE_AND_SWAP_4
     # error "No 4-byte compare-and-swap"
diff --git a/libstdc++-v3/configure b/libstdc++-v3/configure
index 0b4481f0739c..2b2eedeb2e71 100755
--- a/libstdc++-v3/configure
+++ b/libstdc++-v3/configure
@@ -16393,10 +16393,11 @@ ac_compiler_gnu=$ac_cv_cxx_compiler_gnu
     cat confdefs.h - <<_ACEOF >conftest.$ac_ext
 /* end confdefs.h. */
 
-    #if defined __riscv
+    #if defined __AMDGCN__ || defined __nvptx__
+    /* Yes, please. */
+    #elif defined __riscv
     # error "Defaulting to mutex-based locks for ABI compatibility"
-    #endif
-    #if ! defined __GCC_HAVE_SYNC_COMPARE_AND_SWAP_2
+    #elif ! defined __GCC_HAVE_SYNC_COMPARE_AND_SWAP_2
     # error "No 2-byte compare-and-swap"
     #elif ! defined __GCC_HAVE_SYNC_COMPARE_AND_SWAP_4
     # error "No 4-byte compare-and-swap"
@@ -16443,7 +16444,7 @@ $as_echo "mutex" >&6; }
   # unnecessary for this test.
 
     cat > conftest.$ac_ext << EOF
-#line 16446 "configure"
+#line 16447 "configure"
 int main()
 {
   _Decimal32 d1;
@@ -16485,7 +16486,7 @@ ac_compiler_gnu=$ac_cv_cxx_compiler_gnu
   # unnecessary for this test.
 
     cat > conftest.$ac_ext << EOF
-#line 16488 "configure"
+#line 16489 "configure"
 template<typename T1, typename T2>
   struct same
   { typedef T2 type; };
diff --git a/libstdc++-v3/configure.host b/libstdc++-v3/configure.host
index 8375764bf4dc..05ba8fc797c7 100644
--- a/libstdc++-v3/configure.host
+++ b/libstdc++-v3/configure.host
@@ -370,10 +370,21 @@ case "${host}" in
         ;;
     esac
     ;;
+  amdgcn-*-amdhsa)
+    # To avoid greater pain elsewhere, force use of '__atomic' builtins,
+    # irregardless of outcome of 'configure' checks; see PR119645
+    # "GCN, nvptx: libstdc++ 'checking for atomic builtins [...]... no'".
+    atomicity_dir=cpu/generic/atomicity_builtins
+    ;;
   arm*-*-freebsd*)
     port_specific_symbol_files="\$(srcdir)/../config/os/gnu-linux/arm-eabi-extra.ver"
     ;;
   nvptx-*-none)
+    # To avoid greater pain elsewhere, force use of '__atomic' builtins,
+    # irregardless of outcome of 'configure' checks; see PR119645
+    # "GCN, nvptx: libstdc++ 'checking for atomic builtins [...]... no'".
+    atomicity_dir=cpu/generic/atomicity_builtins
+
     # For 'make all-target-libstdc++-v3', re 'alloca'/VLA usage:
     EXTRA_CFLAGS="${EXTRA_CFLAGS} -mfake-ptx-alloca"
     OPTIMIZE_CXXFLAGS="${OPTIMIZE_CXXFLAGS} -mfake-ptx-alloca"
-- 
2.34.1