Hello,

I detected this issue while updating the patches to send to the mailing list. I have not created a bug report.
When the CAS operation fails and expected == guard_bit, __cxa_guard_acquire will return immediately, indicating that the initialisation has already succeeded. However, it's missing the acquire barrier for the changes done on the other thread, to match the release barrier from __cxa_guard_release.

That is:

    thread A                     thread B
    load.acq == 0                load.acq == 0
    __cxa_guard_acquire          __cxa_guard_acquire
    CAS(0 -> 256) success
    __cxa_guard_release
    store.rel(1)
                                 CAS(0 -> 256) fails

At this point, we must synchronise with the store-release from thread A.

-- 
Thiago Macieira - thiago.macieira (AT) intel.com
  Software Architect - Intel Open Source Technology Center
     Intel Sweden AB - Registration Number: 556189-6027
     Knarrarnäsgatan 15, 164 40 Kista, Stockholm, Sweden

2012-08-30  Thiago Macieira  <thiago.macie...@intel.com>

	* libsupc++/guard.cc (__cxa_guard_acquire): must use acquire semantics
	in case of failure, to acquire changes done by the other thread
---
 libstdc++-v3/libsupc++/guard.cc | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/libstdc++-v3/libsupc++/guard.cc b/libstdc++-v3/libsupc++/guard.cc
index 36352e7..73d7221 100644
--- a/libstdc++-v3/libsupc++/guard.cc
+++ b/libstdc++-v3/libsupc++/guard.cc
@@ -253,7 +253,7 @@ namespace __cxxabiv1
 	    int expected(0);
 	    if (__atomic_compare_exchange_n(gi, &expected, pending_bit, false,
 					    __ATOMIC_ACQ_REL,
-					    __ATOMIC_RELAXED))
+					    __ATOMIC_ACQUIRE))
 	      {
 		// This thread should do the initialization.
 		return 1;
-- 
1.7.11.4
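
For illustration only, here is a minimal standalone sketch of the same pattern written with std::atomic instead of the __atomic builtins. It is not the libsupc++ code: the names guard, payload, guard_acquire_sketch, guard_release_sketch and use_payload are hypothetical, the constants simply mirror the values in the scenario described above (1 for initialised, 256 for pending), and waiting for a pending initialisation is reduced to a spin loop. The point it tries to show is that the failure ordering of the CAS is acquire, so a thread that loses the race and sees guard_bit also sees the data written before the store-release.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical stand-ins for the constants used in guard.cc.
constexpr int guard_bit = 1;      // initialisation has completed
constexpr int pending_bit = 256;  // initialisation is in progress

std::atomic<int> guard{0};
int payload = 0;  // plain, non-atomic data protected by the guard

// Returns true if the caller must perform the initialisation.
bool guard_acquire_sketch()
{
    int expected = 0;
    // Success ordering: acq_rel. Failure ordering: acquire, so that a
    // failed CAS which observes guard_bit synchronises with the
    // store-release done by the initialising thread.
    if (guard.compare_exchange_strong(expected, pending_bit,
                                      std::memory_order_acq_rel,
                                      std::memory_order_acquire))
        return true;

    // Simplification (assumption): spin while another thread is still
    // initialising. Each reload is an acquire load as well.
    while (expected != guard_bit)
        expected = guard.load(std::memory_order_acquire);
    return false;
}

// Publishes the initialised data with a store-release.
void guard_release_sketch()
{
    guard.store(guard_bit, std::memory_order_release);
}

// Hypothetical user of the guard: initialises payload once, then reads it.
int use_payload()
{
    if (guard_acquire_sketch()) {
        payload = 42;            // the "initialisation"
        guard_release_sketch();
    }
    // Whichever path was taken, the guard must now be marked initialised.
    assert(guard.load(std::memory_order_relaxed) == guard_bit);
    return payload;
}
```

With a relaxed failure ordering in place of the acquire one, the read of payload on the losing thread would not be ordered after the winner's write when the CAS itself is the operation that observes guard_bit, which is the gap the patch closes.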