> On May 24, 2020, at 02:10, Dominik Pantůček <dominik.pantu...@trustica.cz> 
> wrote:
> 
> At first I was surprised that you are basically suggesting using
> spinlocks (busy-wait loops) instead of futex-backed (at least on Linux)
> fsemaphores. That is a waste of CPU time.

Performing CAS operations in a loop isn’t directly comparable to spinlocking. 
With a spinlock, the blocked thread spends cycles doing absolutely nothing for 
an arbitrarily long time. The blocked thread is completely at the mercy of the 
thread that holds the lock, which may not release it for a lengthy duration 
even if the blocked thread only needs to acquire it briefly.

In contrast, lock-free concurrency approaches that use atomic operations never 
spend CPU time busy-waiting. All threads are always doing useful work; the 
worst case is that some work is duplicated. The burden here is reversed: if one 
thread needs to use the shared resource for a short period of time while a 
long-running computation is in progress on another thread, it doesn’t have to 
wait; it just wins and continues with its business. It’s the long-running 
computation that loses here, since it has to throw away the work it did and 
retry.

This means lock-free approaches are bad if you have (a) lots of contention over 
a shared resource, plus (b) long-running computations using the resource, which 
would be prohibitively expensive to duplicate. But if contention is low and/or 
the operation is sufficiently cheap, the overhead of work duplication will be 
irrelevant. (This is why lock-free concurrency is sometimes called “optimistic 
concurrency”—you’re optimistically hoping nobody else needs the resource while 
you’re using it.) Furthermore, note that acquiring a mutex must ultimately use 
atomic operations internally, so if there is no contention, the lock-free 
approach will fundamentally be faster (or at least no slower) than the locking 
approach.

Now consider what blocking on a mutex must do in addition to the atomic 
operation: it must return to the thread scheduler, since a blocked thread must 
be added to a wakeup list and unscheduled. This incurs all the overhead of a 
context switch, which is small, but it isn’t zero. Compare that to the cost of 
a single enqueue or dequeue operation, which involves just a couple of reads 
and a single write. Soon you end up in a situation where the cost of context 
switching outweighs any work you might duplicate even under thread contention, 
and now you’re beating the mutex under all usage patterns!

tl;dr: Even though lock-free approaches and spinlocks are superficially similar 
in that they involve “spinning in a loop,” it isn’t fair to conflate them! 
Optimistic concurrency is a bad idea if you have operations that need access to 
the shared resource for a long time, since you’ll end up duplicating tons of 
work, but if the operation is cheap, you lose nothing over a mutex-based 
approach and potentially gain a little extra performance.

Alexis
