> > What happens when a program reads from a 'volatile' variable
> > at address xy in a multi-processor system?

Looking at the assembler code produced by GCC: GCC does not emit any
barrier or similar instructions for reads or writes to 'volatile'
variables. So, the loop in test_lock actually waits until the other CPU
_happens_ to flush its cache.

> Might be NUMA related?

Indeed. Probably the CPUs in a NUMA system flush their caches to main
memory less frequently. When we use a lock, by contrast, the thread
library or kernel executes the appropriate cache flush instructions.

Bruno
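
PS: For illustration, here is a minimal sketch of how a flag like the one
polled in such a loop could be written with C11 atomics instead of
'volatile'. It assumes a compiler that provides <stdatomic.h> and POSIX
threads. The names payload, ready, producer and run_handoff are made up
for this example and are not taken from test_lock. With release/acquire
ordering, the compiler has to emit whatever ordering instructions the
target needs (possibly none), which a plain 'volatile' access does not
ask for.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>

static int payload;        /* plain data, published through the flag   */
static atomic_int ready;   /* hypothetical flag, replaces 'volatile int' */

static void *producer(void *arg)
{
    (void)arg;
    payload = 42;
    /* release: the store to payload is ordered before the flag store */
    atomic_store_explicit(&ready, 1, memory_order_release);
    return NULL;
}

/* Spins until the producer has published, then returns the payload. */
int run_handoff(void)
{
    pthread_t t;

    payload = 0;
    atomic_store_explicit(&ready, 0, memory_order_relaxed);

    int rc = pthread_create(&t, NULL, producer, NULL);
    assert(rc == 0);

    /* acquire: pairs with the release store in producer() */
    while (atomic_load_explicit(&ready, memory_order_acquire) == 0)
        ;   /* busy-wait */

    int seen = payload;    /* ordered after the acquire load */
    pthread_join(t, NULL);
    return seen;
}
```

Calling run_handoff() from main and building with something like
"gcc -std=c11 -pthread" should be enough to try it. This is only a sketch
of the general technique, not a proposed patch.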