On Fri, 23 Mar 2007 09:59:11 +0100 Nick Piggin <[EMAIL PROTECTED]> wrote:
> Implement queued spinlocks for i386. This shouldn't increase the size of
> the spinlock structure, while still being able to handle 2^16 CPUs.
>
> Not completely implemented with assembly yet, to make the algorithm a bit
> clearer.
>
> The queued spinlock has 2 fields, a head and a tail, which are indexes
> into a FIFO of waiting CPUs. To take a spinlock, a CPU performs an
> "atomic_inc_return" on the head index, and keeps the returned value as
> a ticket. The CPU then spins until the tail index is equal to that
> ticket.
>
> To unlock a spinlock, the tail index is incremented (this can be non
> atomic, because only the lock owner will modify tail).
>
> Implementation inefficiencies aside, this change should have little
> effect on performance for uncontended locks, but will have quite a
> large cost for highly contended locks [O(N) cacheline transfers vs
> O(1) per lock acquisition, where N is the number of CPUs contending].
> The benefit is that contended locks will not cause any starvation.
>
> Just an idea. Big NUMA hardware seems to have fairness logic that
> prevents starvation for the regular spinlock logic. But it might be
> interesting for the -rt kernel or systems with starvation issues.

It's a very nice idea, Nick.

You also get, for free, the number of CPUs that are queued before you.
On big SMP/NUMA, we could use this information to call a special
lock_cpu_relax() function and avoid cacheline transfers while waiting:

	/* atomically take a ticket: pos gets the previous qhead */
	asm volatile(LOCK_PREFIX "xaddw %0, %1\n\t"
		     : "+r" (pos), "+m" (lock->qhead)
		     :
		     : "memory");

	/* spin until our ticket is being served */
	for (;;) {
		unsigned short nwait = pos - lock->qtail;

		if (likely(nwait == 0))
			break;
		lock_cpu_relax(lock, nwait);
	}

static void lock_cpu_relax(raw_spinlock_t *lock, unsigned int nwait)
{
	/*
	 * The further back in the queue we are, the longer we can
	 * safely wait before re-reading the lock cacheline.
	 */
	unsigned int cycles = nwait * lock->min_cycles_per_round;

	busy_loop(cycles);
}
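busy_loop() is of course hand-waving at this point. One possible sketch,
assuming we simply burn TSC cycles without touching the lock cacheline
(this ignores TSC drift between cores, and min_cycles_per_round would
need per-machine calibration):

	static inline unsigned long long rdtsc(void)
	{
		unsigned int lo, hi;

		__asm__ __volatile__("rdtsc" : "=a" (lo), "=d" (hi));
		return ((unsigned long long)hi << 32) | lo;
	}

	static void busy_loop(unsigned int cycles)
	{
		unsigned long long start = rdtsc();

		/* wait off-cacheline: don't reload lock->qtail until
		 * our turn should be close */
		while (rdtsc() - start < cycles)
			__asm__ __volatile__("rep; nop" ::: "memory");
	}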
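For anyone wanting to play with the algorithm in userspace: here is a
minimal C sketch of the ticket scheme Nick describes, using GCC __sync
builtins in place of the kernel atomics. Type and function names are
mine, not from the patch; both indexes start at 0, and the 16-bit
fields wrap naturally, which is what allows 2^16 CPUs.

	#include <stdint.h>

	struct ticket_lock {
		volatile uint16_t head;	/* next ticket to hand out */
		volatile uint16_t tail;	/* ticket currently served */
	};

	static void ticket_lock(struct ticket_lock *lock)
	{
		/* grab a ticket: returns the old value of head */
		uint16_t ticket = __sync_fetch_and_add(&lock->head, 1);

		/* spin until the tail index equals our ticket */
		while (lock->tail != ticket)
			__asm__ __volatile__("rep; nop" ::: "memory");
	}

	static void ticket_unlock(struct ticket_lock *lock)
	{
		/* compiler barrier: keep critical-section stores above
		 * the unlock (x86 keeps stores ordered in hardware) */
		__asm__ __volatile__("" ::: "memory");
		lock->tail++;	/* non-atomic: only the owner writes tail */
	}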