On Thu, Apr 07, 2016 at 09:43:33AM -0700, Andy Lutomirski wrote:
> enter the critical section:
> 1:
>     movq %[cpu], %%r12
>     movq {address of counter for our cpu}, %%r13
>     movq {some fresh value}, (%%r13)
>     cmpq %[cpu], %%r12
>     jne 1b

This is inherently racy; you forgot the detail of 'some fresh value', but
since you want to avoid collisions you really want an increment.

But load-store archs cannot do that. Or rather, they need to do:

	load	Rn, $event
	add	Rn, Rn, 1
	store	$event, Rn

But if they're preempted in the middle, two threads will collide and
generate the _same_ increment. Comparing CPU numbers will not fix that.