Re: History question: Thread-safe profiling instrumentation

Andrew Pinski Mon, 22 Apr 2013 15:29:39 -0700

On Mon, Apr 22, 2013 at 3:19 PM, Andi Kleen <a...@firstfloor.org> wrote:
> Bill Schmidt <wschm...@linux.vnet.ibm.com> writes:
>>
>> My reason for asking involves a large heavily-threaded application that
>> is improved by feedback-directed optimization on some platforms, but not
>> on others.  One theory is that a defective profile is generated due to
>> counter dropouts from contention.  I'm somewhat skeptical about this
>> given that some platforms seem to do well with it, but it's possible.
>> I'm hopeful that knowing why the thread-safe profiling patch wasn't
>> implemented will give us more of a clue.
>
> Atomics are slower even single threaded. In any case you'll have a
> gigantic slowdown if there is contention. Better use per thread
> counters.


Actually, it depends on the processor.  For example, on Octeon2 an
atomic addition is faster than a non-atomic one, because the atomic
instructions operate directly on L2 rather than going through L1 and
then to L2.  In other words, the atomic addition of a counter does not
have to populate the L1 cache in this case.
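
To make the two approaches under discussion concrete, here is a minimal
sketch in C11 with POSIX threads. It is not GCC's profiling
instrumentation; every name in it (run_atomic, run_per_thread, slots,
and so on) is hypothetical, and the 64-byte cache line is an assumption.
One variant bumps a single shared counter with a relaxed atomic add, the
other gives each thread its own padded counter and sums them after the
threads are joined.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>

#define MAX_THREADS 64
#define CACHE_LINE  64          /* assumed cache line size in bytes */

/* Variant 1: one counter shared by all threads, updated atomically. */
static atomic_long shared_counter;

/* Variant 2: one counter per thread, aligned so that no two counters
   share a cache line (avoids false sharing). */
struct slot {
    _Alignas(CACHE_LINE) long count;
};
static struct slot slots[MAX_THREADS];

struct job {
    int  id;                    /* index into slots[] */
    long iters;                 /* number of increments to perform */
};

static void *atomic_worker(void *arg)
{
    const struct job *j = arg;
    for (long i = 0; i < j->iters; i++)
        atomic_fetch_add_explicit(&shared_counter, 1, memory_order_relaxed);
    return NULL;
}

static void *private_worker(void *arg)
{
    const struct job *j = arg;
    for (long i = 0; i < j->iters; i++)
        slots[j->id].count++;   /* plain add, no other thread touches it */
    return NULL;
}

/* Start nthreads workers, wait for them, and return the total count. */
static long run(void *(*worker)(void *), int nthreads, long iters)
{
    pthread_t  tids[MAX_THREADS];
    struct job jobs[MAX_THREADS];
    long       total = 0;

    assert(nthreads > 0 && nthreads <= MAX_THREADS);
    atomic_store(&shared_counter, 0);

    for (int t = 0; t < nthreads; t++) {
        slots[t].count = 0;
        jobs[t] = (struct job){ t, iters };
        int rc = pthread_create(&tids[t], NULL, worker, &jobs[t]);
        assert(rc == 0);
        (void)rc;
    }
    for (int t = 0; t < nthreads; t++) {
        pthread_join(tids[t], NULL);
        total += slots[t].count;        /* merge per-thread counters */
    }
    return total + atomic_load(&shared_counter);
}

long run_atomic(int nthreads, long iters)
{
    return run(atomic_worker, nthreads, iters);
}

long run_per_thread(int nthreads, long iters)
{
    return run(private_worker, nthreads, iters);
}
```

Both variants produce an exact total, unlike a plain non-atomic add on
a shared counter, which can lose updates under contention. Which one is
faster depends on the processor, as noted above, so timing the two on
the target hardware is the only reliable way to decide.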

Thanks,
Andrew Pinski


>
> -Andi
>
> --
> a...@linux.intel.com -- Speaking for myself only
