> > I just tested a fetch-and-swap+exp.backoff spinlock with usermode on a
> > program that spawns N threads and each thread performs 2**M atomic
> > increments on the same variable. That is, a degenerate worst-case kind
> > of contention. N varies from 1 to 64, and M=15 on all runs, 5 runs per
> > experiment:
> >
> > http://imgur.com/XpYctyT
> > With backoff, the per-access latency grows roughly linearly with the
> > number of cores, i.e. this is scalable. The other two are clearly
> > superlinear.
>
> Just tried MCS, CLH and ticket spinlocks (with and without backoff).
> They take essentially forever for this (admittedly worst-case) test;
[snip interesting stuff]

Yeah, fair spinlocks in userspace wasn't a smart suggestion. :)

Paolo
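
For readers following along, here is a rough sketch of the kind of test described in the quoted text: N threads each perform 2**M increments on one shared counter, serialized by a fetch-and-swap spinlock with exponential backoff. It is not the code that was actually benchmarked. All names (backoff_lock, run_contention_test), the backoff bounds, and the sched_yield() call once the backoff cap is reached are assumptions made for illustration, and guarding a plain increment with the lock is a simplification of "atomic increments".

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

/* Assumed backoff bounds (in busy-loop iterations); tune for real hardware. */
#define BACKOFF_MIN 4u
#define BACKOFF_MAX 1024u

/* Unfair test-and-set lock built on atomic exchange (fetch-and-swap). */
typedef struct {
    atomic_bool locked;
} backoff_lock;

static void backoff_lock_acquire(backoff_lock *l)
{
    unsigned delay = BACKOFF_MIN;

    /* exchange returns the old value: false means we now own the lock */
    while (atomic_exchange_explicit(&l->locked, true, memory_order_acquire)) {
        /* spin locally so we stay off the contended cache line */
        for (volatile unsigned i = 0; i < delay; i++) {
        }
        if (delay < BACKOFF_MAX) {
            delay *= 2;        /* exponential backoff */
        } else {
            sched_yield();     /* assumption: yield once the cap is reached */
        }
    }
}

static void backoff_lock_release(backoff_lock *l)
{
    atomic_store_explicit(&l->locked, false, memory_order_release);
}

typedef struct {
    backoff_lock *lock;
    uint64_t *counter;
    uint64_t iterations;
} worker_arg;

static void *worker(void *opaque)
{
    worker_arg *arg = opaque;

    for (uint64_t i = 0; i < arg->iterations; i++) {
        backoff_lock_acquire(arg->lock);
        (*arg->counter)++;     /* plain increment, protected by the lock */
        backoff_lock_release(arg->lock);
    }
    return NULL;
}

/* Spawn n threads doing 2**m locked increments each; return the final count. */
static uint64_t run_contention_test(unsigned n, unsigned m)
{
    backoff_lock lock;
    uint64_t counter = 0;
    unsigned started = 0;
    pthread_t *threads;

    assert(m < 64);            /* keep the shift below well-defined */
    worker_arg arg = { &lock, &counter, (uint64_t)1 << m };

    threads = malloc((n ? n : 1) * sizeof(*threads));
    if (threads == NULL) {
        return 0;
    }
    atomic_init(&lock.locked, false);

    /* stop early if thread creation fails; join only what was started */
    while (started < n &&
           pthread_create(&threads[started], NULL, worker, &arg) == 0) {
        started++;
    }
    for (unsigned i = 0; i < started; i++) {
        pthread_join(threads[i], NULL);
    }
    free(threads);
    return counter;
}
```

Calling run_contention_test(N, 15) for N from 1 to 64 and timing each call would roughly mirror the experiment quoted above. A fair lock (ticket, MCS, CLH) dropped into the same harness hands the lock to a specific waiter, which is where userspace preemption hurts.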