* Waiman Long wrote:
> BTW, I have also been thinking about extracting the spinlock out from the
> mutex structure for some busy mutex by adding a pointer to an external
> auxiliary structure (separately allocated at init time). The idea is to use
> the external spinlock if available. [...]
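
A minimal userspace sketch of that idea, not the posted patch: the lock
structure keeps an optional pointer to a separately allocated auxiliary
structure and takes its wait lock from there when one is present. The names
(aux_lock, my_mutex_*) are made up and pthread spinlocks stand in for the
kernel's spinlock_t:

#include <pthread.h>
#include <stdlib.h>

struct aux_lock {
    pthread_spinlock_t wait_lock;   /* can live in its own cacheline */
};

struct my_mutex {
    int count;                      /* 1: unlocked, <= 0: locked */
    pthread_spinlock_t wait_lock;   /* embedded fallback */
    struct aux_lock *aux;           /* optional, allocated at init time */
};

static int my_mutex_init(struct my_mutex *m, int want_external)
{
    m->count = 1;
    m->aux = NULL;
    pthread_spin_init(&m->wait_lock, PTHREAD_PROCESS_PRIVATE);
    if (want_external) {
        /* Separately allocated auxiliary structure, set up once at init. */
        m->aux = malloc(sizeof(*m->aux));
        if (m->aux)
            pthread_spin_init(&m->aux->wait_lock, PTHREAD_PROCESS_PRIVATE);
    }
    return 0;
}

/* Use the external spinlock if available, otherwise the embedded one. */
static pthread_spinlock_t *my_mutex_wait_lock(struct my_mutex *m)
{
    return m->aux ? &m->aux->wait_lock : &m->wait_lock;
}
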
On 04/10/2013 01:16 PM, Ingo Molnar wrote:
> * Waiman Long wrote:
> > On 04/10/2013 06:31 AM, Ingo Molnar wrote:
> > > * Waiman Long wrote:
> > >
> > > > That said, the MUTEX_SHOULD_XCHG_COUNT macro should die. Why shouldn't all
> > > > architectures just consider negative counts to be locked? It doesn't matter
> > > > that some might only ever see -1.
* Waiman Long wrote:
> On 04/10/2013 06:31 AM, Ingo Molnar wrote:
> > * Waiman Long wrote:
> >
> > > > That said, the MUTEX_SHOULD_XCHG_COUNT macro should die. Why shouldn't all
> > > > architectures just consider negative counts to be locked? It doesn't matter
> > > > that some might only ever see -1.
> >
On 04/10/2013 06:31 AM, Ingo Molnar wrote:
> * Waiman Long wrote:
>
> > > That said, the MUTEX_SHOULD_XCHG_COUNT macro should die. Why shouldn't all
> > > architectures just consider negative counts to be locked? It doesn't matter
> > > that some might only ever see -1.

I think so too. However, I don't have the machines to test out other [...]
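
A small sketch of the point under discussion, with C11 atomics standing in for
the kernel's atomic_t and illustrative names rather than the kernel's: peek at
the count with a plain read and treat any negative value as locked, so no
per-architecture MUTEX_SHOULD_XCHG_COUNT-style guard is needed before the xchg.

#include <stdatomic.h>
#include <stdbool.h>

struct my_mutex {
    atomic_int count;   /* 1: unlocked, 0: locked, < 0: locked + waiters */
};

static bool my_mutex_try_grab(struct my_mutex *m)
{
    /* Any negative count means locked: don't bother with the xchg. */
    if (atomic_load_explicit(&m->count, memory_order_relaxed) < 0)
        return false;

    /* xchg to -1; we own the lock only if it was 1 (unlocked) before. */
    return atomic_exchange(&m->count, -1) == 1;
}
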
On 04/10/2013 06:28 AM, Ingo Molnar wrote:
> * Waiman Long wrote:
>
> Furthermore, since you are seeing this effect so profoundly, have you
> considered using another approach, such as queueing all the poll-waiters in
> some fashion?
>
> That would optimize your workload additionally: removing the 'stampede' of
> trylock attempts [...]
On Wed, Apr 10, 2013 at 7:09 AM, Robin Holt wrote:
> On Mon, Apr 08, 2013 at 07:38:39AM -0700, Linus Torvalds wrote:
>>
>> I forget where we saw the case where we should *not* read the initial
>> value, though. Anybody remember?
>
> I think you might be remembering ia64. Fairly early on, I recall [...]
On Mon, Apr 08, 2013 at 07:38:39AM -0700, Linus Torvalds wrote:
> On Mon, Apr 8, 2013 at 5:42 AM, Ingo Molnar wrote:
> >
> > AFAICS the main performance trade-off is the following: when the owner CPU
> > unlocks the mutex, we'll poll it via a read first, which turns the cacheline
> > into shared-read MESI state. Then we notice that its content signals 'lock
> > is available' [...]
* Waiman Long wrote:
> > That said, the MUTEX_SHOULD_XCHG_COUNT macro should die. Why shouldn't all
> > architectures just consider negative counts to be locked? It doesn't matter
> > that some might only ever see -1.
>
> I think so too. However, I don't have the machines to test out other [...]
>
* Waiman Long wrote:
> > Furthermore, since you are seeing this effect so profoundly, have you
> > considered using another approach, such as queueing all the poll-waiters in
> > some fashion?
> >
> > That would optimize your workload additionally: removing the 'stampede' of
> > trylock attempts [...]
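
A rough sketch of what "queueing all the poll-waiters" can look like: an
MCS-style queue in self-contained C11, illustrative only and not the code that
was eventually merged. Each spinner waits on its own node, so only the waiter
at the head of the queue touches the lock word and the trylock stampede goes
away; everyone else spins on a cacheline they own.

#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

struct qnode {
    _Atomic(struct qnode *) next;
    atomic_bool locked;             /* true while we must keep waiting */
};

struct qlock {
    _Atomic(struct qnode *) tail;
};

static void q_spin_lock(struct qlock *q, struct qnode *me)
{
    struct qnode *prev;

    atomic_store(&me->next, (struct qnode *)NULL);
    atomic_store(&me->locked, true);

    prev = atomic_exchange(&q->tail, me);       /* join the queue */
    if (prev) {
        atomic_store(&prev->next, me);
        while (atomic_load(&me->locked))        /* spin on our own node */
            ;
    }
}

static void q_spin_unlock(struct qlock *q, struct qnode *me)
{
    struct qnode *next = atomic_load(&me->next);

    if (!next) {
        /* No known successor: try to reset the tail to empty. */
        struct qnode *expect = me;
        if (atomic_compare_exchange_strong(&q->tail, &expect, NULL))
            return;
        /* A successor is enqueueing itself; wait for the link to appear. */
        while (!(next = atomic_load(&me->next)))
            ;
    }
    atomic_store(&next->locked, false);         /* hand off to the next waiter */
}
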
On 04/08/2013 10:38 AM, Linus Torvalds wrote:
> On Mon, Apr 8, 2013 at 5:42 AM, Ingo Molnar wrote:
> >
> > AFAICS the main performance trade-off is the following: when the owner CPU
> > unlocks the mutex, we'll poll it via a read first, which turns the cacheline
> > into shared-read MESI state. Then we notice that its content signals 'lock
> > is available' [...]
On 04/08/2013 08:42 AM, Ingo Molnar wrote:
> * Waiman Long wrote:
>
> > In the __mutex_lock_common() function, an initial entry into
> > the lock slow path will cause two atomic_xchg instructions to be
> > issued. Together with the atomic decrement in the fast path, a total
> > of three atomic read-modify-write instructions will be issued in
> > rapid succession. [...]
* Linus Torvalds wrote:
> On Mon, Apr 8, 2013 at 5:42 AM, Ingo Molnar wrote:
> >
> > AFAICS the main performance trade-off is the following: when the owner CPU
> > unlocks the mutex, we'll poll it via a read first, which turns the cacheline
> > into shared-read MESI state. Then we notice that its content signals 'lock
> > is available' [...]
On Mon, Apr 8, 2013 at 5:42 AM, Ingo Molnar wrote:
>
> AFAICS the main performance trade-off is the following: when the owner CPU
> unlocks the mutex, we'll poll it via a read first, which turns the cacheline
> into shared-read MESI state. Then we notice that its content signals 'lock is
> available' [...]
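
For reference, the polling pattern being weighed here, as a self-contained C11
sketch with illustrative names rather than kernel code: check the count with a
plain read, which leaves the cacheline in shared state across the spinners, and
only attempt the atomic exchange once the value signals that the lock is free.

#include <stdatomic.h>

struct my_mutex {
    atomic_int count;   /* 1: unlocked, 0 or negative: locked */
};

static void my_mutex_spin(struct my_mutex *m)
{
    for (;;) {
        /* Plain read: the cacheline stays shared among the spinners. */
        if (atomic_load_explicit(&m->count, memory_order_relaxed) == 1 &&
            atomic_exchange(&m->count, 0) == 1)
            return;     /* the exchange pulled the line exclusive and won */
        /* a cpu_relax()/pause hint would go here in real code */
    }
}
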
* Waiman Long wrote:
> In the __mutex_lock_common() function, an initial entry into
> the lock slow path will cause two atomic_xchg instructions to be
> issued. Together with the atomic decrement in the fast path, a total
> of three atomic read-modify-write instructions will be issued in
> rapid succession. [...]
In the __mutex_lock_common() function, an initial entry into
the lock slow path will cause two atomic_xchg instructions to be
issued. Together with the atomic decrement in the fast path, a total
of three atomic read-modify-write instructions will be issued in
rapid succession. This can cause a lot [...]
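
The sequence being counted reads roughly like the sketch below, a simplified,
self-contained C11 version with the wait-list and wakeup handling elided and
C11 atomics standing in for the kernel's atomic_t and __mutex_lock_common():
one atomic decrement in the fastpath, then two atomic exchanges back to back
on the initial pass through the slowpath, all hitting the same cacheline.

#include <stdatomic.h>

struct my_mutex {
    atomic_int count;   /* 1: unlocked, 0: locked, -1: locked + waiters */
};

void my_mutex_lock(struct my_mutex *m)
{
    /* fastpath, atomic RMW #1 (the atomic decrement) */
    if (atomic_fetch_sub(&m->count, 1) == 1)
        return;                 /* count was 1, now 0: uncontended */

    /* slowpath, structured like __mutex_lock_common(): */

    /* atomic RMW #2: force the count to -1 to flag waiters */
    if (atomic_exchange(&m->count, -1) == 1)
        return;                 /* it was released in the meantime */

    for (;;) {
        /* atomic RMW #3 on the initial pass (repeated after each wakeup) */
        if (atomic_exchange(&m->count, -1) == 1)
            return;
        /* ... block until the owner releases the lock ... */
    }
}
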