Chad Attermann wrote:
Hello all. Late last year I posted a couple of questions about
multi-threaded application hangs in Solaris 10 for x86 platforms, and
about thread-safety of std::basic_string in general. This was an
attempt to solve persistent problems I have been experiencing with my
application hanging due to CPU utilization shooting to 100%, with the
__gnu_cxx::__exchange_and_add function frequently making appearances at
the top of the stack trace of several threads.
I believe I have made a breakthrough recently and wanted to solicit the
opinion of some experts on this. I seem to have narrowed the problem
down to running my application as root versus an unprivileged user, and
further isolated the suspected cause to varying thread priorities in my
application. I have theorized that spin-locks in gcc, particularly in
the atomicity __gnu_cxx::__exchange_and_add function, are causing higher
priority threads to consume all available CPU cycles while spinning
indefinitely waiting for a lower priority thread that holds the lock.
Now I am already aware that messing with thread priorities is dangerous
and often an exercise in futility, but I am surprised that something so
elemental as an atomic test-and-set operation that may be used
extensively throughout gcc could possibly be the culprit for all of the
trouble I have been experiencing.
More than anything I'm hoping for a sanity check on this, even if it's
just to confirm what may be obvious to others: that modifying thread
priorities is strictly off-limits except in extreme circumstances with
careful control over what operations are performed. Or perhaps there's
another solution that has eluded my searches, maybe a bug fix or some
way of avoiding such spin-locks in gcc, making varying thread priorities
viable and safe.
Thanks in advance for any insight, and at the very least I hope that
this will serve as a warning to others who might find themselves in the
same situation.

I wonder if you are seeing either priority inversion or a deadlock.
As always, a small testcase would be useful.
David Daney