On Tue, May 14, 2019 at 9:56 AM Peter Zijlstra <pet...@infradead.org> wrote:
>
> Understood; the problem is that "*p++" is not good enough for local_t
> either (on load-store architectures), since it needs to be "atomic" wrt
> all other instructions on that CPU, most notably exceptions.

Right. But I don't think that's the issue here, since if it was then it
would be a problem even on UP.

And while the CPU-local ones want atomicity, they *shouldn't* have the
issue of another CPU modifying them, so even if you were to lose
exclusive ownership of the cacheline (because some other CPU is reading
your per-cpu data for statistics or whatever), the final end result
should be fine.

End result: I suspect ll/sc still works for cpu-local stuff without any
extra loongson hacks.

But I agree that it would be good to verify.

              Linus
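
For illustration only, here is a minimal userspace sketch of the kind of
retry loop being discussed. It is not the kernel's local_t implementation:
sketch_local_t and sketch_local_inc are hypothetical names, and C11 atomics
stand in for the architecture's ll/sc sequence. The assumption is that on an
ll/sc machine the weak compare-exchange is typically lowered to a
load-linked/store-conditional pair, so a plain "*p++" (separate load, add,
store) is replaced by a sequence that retries if anything intervened between
the load and the store.

```c
#include <stdatomic.h>

/* Hypothetical stand-in for a CPU-local counter; NOT the kernel's local_t. */
typedef struct {
	atomic_long v;
} sketch_local_t;

/*
 * Increment with a retry loop. Relaxed ordering is used on purpose: the
 * counter is assumed to be written by one CPU only, so the loop needs to be
 * atomic with respect to interruptions on that CPU, not ordered against
 * other CPUs.
 */
static long sketch_local_inc(sketch_local_t *l)
{
	long old = atomic_load_explicit(&l->v, memory_order_relaxed);

	/*
	 * The weak form may fail spuriously (as a store-conditional can).
	 * On failure, 'old' is refreshed with the current value and we retry.
	 */
	while (!atomic_compare_exchange_weak_explicit(&l->v, &old, old + 1,
						      memory_order_relaxed,
						      memory_order_relaxed))
		;

	return old + 1;
}
```

A reader on another CPU that only loads the value can cause the update to be
retried, but cannot change the final count, which is the property the
argument above relies on.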