http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50065
--- Comment #8 from Zhangxi Tan <tanzhangxi at gmail dot com> 2011-08-14 21:00:40 UTC ---
Thanks for the clear explanation. I agree that a memory barrier would solve
this issue.

Regarding the spinlock_unlock in Linux, the regular arch_spin_unlock is
implemented with a single inline assembly statement, which prevents the memory
reordering in C. However, for the 32-bit port, arch_write_unlock is still
defined as follows, without a memory barrier, in
arch/sparc/include/asm/spinlock_32.h:

#define arch_write_unlock(rw) do { (rw)->lock = 0; } while(0)

OTOH, the 64-bit implementation is OK. Or did I miss something here? Anyway, I
think this is a separate issue from this thread.

(In reply to comment #6)
> > The code is equivalent to
> >
> > volatile unsigned char lock;
> > int remap_barrier;
> >
> > while (atomic_test_and_set(lock)) {
> >   while (lock) {
> >     ;
> >   }
> > }
> > remap_barrier++;
> > lock = 0;
> >
> > Eric: could you let me know why you think the code inside function
> > spinlock_lock(&remap_lock) is a NOP?
>
> I don't, you simply misquoted, I wrote "the end of the code". The first part
> of the spinlock implementation is correct, in particular you have the required
> memory barrier in spinlock_is_locked. The second part is not correct, as you
> don't have the memory barrier in spinlock_unlock.
>
> > Also, the arch_write_lock/unlock in the SPARC port of Linux uses a very
> > similar implementation.
>
> No, it precisely doesn't, it has the memory barrier in spinlock_unlock.
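
For reference, the fix discussed above (a memory barrier before the store that releases the lock) might look roughly like the sketch below. This is not the reporter's code or the Linux implementation. The barrier() macro, the atomic_test_and_set() wrapper around GCC's __sync_lock_test_and_set builtin, and critical_section() are assumptions made for illustration. The barrier shown is a compiler-only barrier in GCC inline-asm syntax; depending on the memory model the CPU runs in, a hardware barrier may be needed as well.

```c
/* Compiler-only barrier (GCC/Clang syntax): the compiler may not move
 * memory accesses across this point. It emits no instruction.
 * Hypothetical helper, named here for illustration only. */
#define barrier() __asm__ __volatile__("" ::: "memory")

static volatile unsigned char lock;
static int remap_barrier;   /* ordinary (non-volatile) data protected by the lock */

/* Assumed wrapper: atomically store 1 and return the previous value.
 * The GCC builtin acts as an acquire barrier. */
static int atomic_test_and_set(volatile unsigned char *l)
{
    return __sync_lock_test_and_set(l, 1);
}

static void spinlock_lock(volatile unsigned char *l)
{
    while (atomic_test_and_set(l)) {
        while (*l) {
            ;   /* spin on a plain read until the lock looks free */
        }
    }
}

static void spinlock_unlock(volatile unsigned char *l)
{
    /* Without this barrier the compiler is free to sink the non-volatile
     * store to remap_barrier below the volatile store that frees the lock. */
    barrier();
    *l = 0;
}

/* Increment the protected counter under the lock and return the new value. */
static int critical_section(void)
{
    int value;

    spinlock_lock(&lock);
    value = ++remap_barrier;
    spinlock_unlock(&lock);
    return value;
}
```

With the barrier in place, the increment of remap_barrier has to be emitted before the store that clears the lock, which is the ordering the unlock path needs.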