"Ian Lance Taylor" <[EMAIL PROTECTED]> writes:
"Chad Attermann" <[EMAIL PROTECTED]> writes:
Hello all. Late last year I posted a couple of questions about
multi-threaded application hangs in Solaris 10 for x86 platforms, and
about thread-safety of std::basic_string in general. This was an
attempt to solve persistent problems I have been experiencing with my
application hanging due to CPU utilization shooting to 100%, with the
__gnu_cxx::__exchange_and_add function frequently making appearances
at the top of the stack trace of several threads.
I believe I have made a breakthrough recently and wanted to solicit
the opinion of some experts on this. I seem to have narrowed the
problem down to running my application as root versus an unprivileged
user, and further isolated the suspected cause to varying thread
priorities in my application. I have theorized that spin-locks in gcc,
particularly in the atomicity __gnu_cxx::__exchange_and_add function,
are causing higher priority threads to consume all available CPU
cycles while spinning indefinitely waiting for a lower priority thread
that holds the lock. Now I am already aware that messing with thread
priorities is dangerous and often an exercise in futility, but I am
surprised that something so elemental as an atomic test-and-set
operation that may be used extensively throughout gcc could possibly
be the culprit for all of the trouble I have been experiencing.
You explicitly mentioned x86. For x86, __gnu_cxx::__exchange_and_add
does not use a spin-lock.
If you mean that other code may use spin locks built on top of
__exchange_and_add, then, yes, in that case you could be getting a
priority inversion. But gcc itself does not use any such code. So if
you are seeing a problem of this sort, it is not a problem with gcc.
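
To make the priority-inversion scenario concrete, here is a minimal sketch of a spin lock built on top of an atomic read-modify-write primitive. It is not gcc or libstdc++ code: SpinLock and count_with_spinlock are hypothetical names, and std::atomic from modern C++ stands in for the primitive. The busy-wait loop is the point of interest: if a high-priority thread spins there while a lower-priority thread that holds the lock is never scheduled, the waiter can consume the CPU indefinitely.

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <vector>

// Hypothetical spin lock layered on an atomic exchange (illustration only).
class SpinLock {
public:
    void lock() {
        // exchange() returns the previous value; true means the lock is held.
        while (locked_.exchange(true, std::memory_order_acquire)) {
            // Busy-wait: the waiter never blocks or yields, so a
            // higher-priority waiter can starve the lock holder.
        }
    }
    void unlock() { locked_.store(false, std::memory_order_release); }

private:
    std::atomic<bool> locked_{false};
};

// Hypothetical demo: several threads increment a plain int under the lock.
int count_with_spinlock(int threads, int iterations) {
    assert(threads > 0 && iterations >= 0);
    SpinLock lock;
    int counter = 0;
    std::vector<std::thread> pool;
    for (int t = 0; t < threads; ++t) {
        pool.emplace_back([&lock, &counter, iterations] {
            for (int i = 0; i < iterations; ++i) {
                lock.lock();
                ++counter;  // protected by the spin lock
                lock.unlock();
            }
        });
    }
    for (auto& th : pool) th.join();
    return counter;
}
```

With equal thread priorities this completes normally; the hang described above would only arise when the scheduler keeps the lock holder off the CPU in favor of a spinning waiter.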
I doubted it at first too, but from what I can tell, the version of gcc that
ships with Solaris 10 x86 is gcc 3.4.3 for i386. Below is the output of the
pre-installed gcc given the -v switch:
Reading specs from /usr/sfw/lib/gcc/i386-pc-solaris2.10/3.4.3/specs
Configured with: /builds/sfw10-gate/usr/src/cmd/gcc/gcc-3.4.3/configure --prefix=/usr/sfw --with-as=/usr/sfw/bin/gas --with-gnu-as --with-ld=/usr/ccs/bin/ld --without-gnu-ld --enable-languages=c,c++ --enable-shared
Thread model: posix
gcc version 3.4.3 (csl-sol210-3_4-branch+sol_rpath)
Below is a snippet of code from gcc 3.4.3 source
./libstdc++-v3/config/cpu/i386/atomicity.h:
  _Atomic_word
  __attribute__ ((__unused__))
  __exchange_and_add(volatile _Atomic_word* __mem, int __val)
  {
    register _Atomic_word __result, __tmp = 1;

    // Obtain the atomic exchange/add spin lock.
    do
      {
        __asm__ __volatile__ ("xchg{l} {%0,%1|%1,%0}"
                              : "=m" (_Atomicity_lock<0>::_S_atomicity_lock),
                                "+r" (__tmp)
                              : "m" (_Atomicity_lock<0>::_S_atomicity_lock));
      }
    while (__tmp);

    __result = *__mem;
    *__mem += __val;

    // Release spin lock.
    _Atomicity_lock<0>::_S_atomicity_lock = 0;

    return __result;
  }
I cannot confirm that it was the i386 code that was included in the gcc build, but it
appears that way from the signature. Is this perhaps a problem with the way
that the gcc 3.4.3 shipping with Solaris 10 x86 was built? Should it have been
built with the i486 version instead, which does not use spin-locks?
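
For contrast with the spin-lock version quoted above, here is a minimal sketch of an exchange-and-add that takes no lock at all. It is an illustration in modern C++, not the gcc 3.4.3 source: exchange_and_add and count_lock_free are hypothetical names, and std::atomic<int>::fetch_add stands in for the hardware primitive. On x86, compilers typically lower fetch_add to a single lock-prefixed xadd instruction, so no thread ever waits on another thread to release anything.

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <vector>

// Hypothetical stand-in for __exchange_and_add:
// atomically adds val to mem and returns the value mem held before.
inline int exchange_and_add(std::atomic<int>& mem, int val) {
    return mem.fetch_add(val, std::memory_order_acq_rel);
}

// Hypothetical demo: several threads bump a shared counter without a lock.
int count_lock_free(int threads, int iterations) {
    assert(threads > 0 && iterations >= 0);
    std::atomic<int> counter{0};
    std::vector<std::thread> pool;
    for (int t = 0; t < threads; ++t) {
        pool.emplace_back([&counter, iterations] {
            for (int i = 0; i < iterations; ++i) {
                exchange_and_add(counter, 1);  // one atomic step, no spinning
            }
        });
    }
    for (auto& th : pool) th.join();
    return counter.load();
}
```

Because each update is a single atomic instruction, a preempted low-priority thread cannot leave a lock held, which is why this form would not show the priority-inversion behavior discussed in this thread.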