* Linus Torvalds <torva...@linux-foundation.org> wrote:

> On Mon, Apr 8, 2013 at 5:42 AM, Ingo Molnar <mi...@kernel.org> wrote:
> >
> > AFAICS the main performance trade-off is the following: when the owner
> > CPU unlocks the mutex, we'll poll it via a read first, which turns the
> > cacheline into shared-read MESI state. Then we notice that its content
> > signals 'lock is available', and we attempt the trylock again.
> >
> > This increases lock latency in the few-contended-tasks case slightly -
> > and we'd like to know by precisely how much, not just for a generic
> > '10-100 users' case which does not tell much about the contention level.
>
> We had this problem for *some* lock where we used a "read + cmpxchg" in
> the hotpath and it caused us problems due to two cacheline state
> transitions (first to shared, then to exclusive). It was faster to just
> assume it was unlocked and try to do an immediate cmpxchg.
>
> But iirc it is a non-issue for this case, because this is only about the
> contended slow path.
>
> I forget where we saw the case where we should *not* read the initial
> value, though. Anybody remember?
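( To spell out the two patterns being compared - a minimal illustrative
  sketch only, using the old 1 == unlocked counter convention and made-up
  helper names, not the actual mutex fast/slow path code: )

#include <linux/atomic.h>	/* atomic_t, atomic_read(), atomic_cmpxchg() */
#include <linux/types.h>	/* bool */

/*
 * Variant A: read first, then cmpxchg.  The initial load pulls the
 * cacheline in in shared state, and a successful cmpxchg then has to
 * upgrade it to exclusive - two coherency transitions when the lock
 * turns out to be free.
 */
static inline bool sketch_trylock_read_first(atomic_t *count)
{
	if (atomic_read(count) != 1)			/* line -> Shared */
		return false;
	return atomic_cmpxchg(count, 1, 0) == 1;	/* Shared -> Exclusive */
}

/*
 * Variant B: assume the lock is free and cmpxchg straight away.
 * A single request-for-ownership transition when the lock really is
 * free, at the cost of a useless exclusive acquisition (and extra
 * cacheline bouncing) when it is not.
 */
static inline bool sketch_trylock_cmpxchg_only(atomic_t *count)
{
	return atomic_cmpxchg(count, 1, 0) == 1;
}

Variant A avoids the wasted exclusive acquisition when the lock is
usually observed held, variant B avoids the extra Shared->Exclusive
upgrade when it is usually free - which one wins depends on the
contention level, hence the request for more fine-grained numbers.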
I had this vague recollection too - and some digging suggests that it
might have been this discussion on lkml about 3 years ago:

  [RFC][PATCH 6/8] mm: handle_speculative_fault()

These numbers PeterZ ran:

  http://lkml.indiana.edu/hypermail/linux/kernel/1001.1/00170.html

appear to show such an effect, on a smaller NUMA system.

( But I'm quite sure it came up somewhere else as well, just cannot
  place it. Probabilistic biological search indices are annoying. )

Thanks,

	Ingo