On Wed, Aug 31, 2016 at 07:28:18AM +1000, Benjamin Herrenschmidt wrote:
> On powerpc we have a sync deep in _switch to achieve that.

OK, for giggles, could you (or Balbir) check what happens if you take that sync out? There should be enough serialization in the generic code to cover the case that code mentions. ARM64 has a stronger barrier in its context switch code, but that's because they need to sync against external agents (like their TLB and cache) and no amount of generic locking is going to cover that.
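
To illustrate the argument (a userspace analogy only, not kernel code): when two threads hand state over through the same lock, the unlock on one side and the lock on the other already order the accesses, so no extra barrier is needed in between. The sketch below assumes POSIX threads; every name in it (handoff_lock, task_state, handed_off, old_cpu, new_cpu, handoff_demo) is made up for the example. It says nothing about the ARM64 case, where the barrier is for external agents that take no part in the locking.

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>

/* Hypothetical stand-ins: a lock shared by both sides plus the state handed over. */
static pthread_mutex_t handoff_lock = PTHREAD_MUTEX_INITIALIZER;
static int task_state;   /* data written by the "old CPU" */
static int handed_off;   /* set once the data is ready */

/* Writer: publishes the state while holding the lock. */
static void *old_cpu(void *arg)
{
    (void)arg;
    pthread_mutex_lock(&handoff_lock);
    task_state = 42;
    handed_off = 1;
    pthread_mutex_unlock(&handoff_lock);   /* unlock orders the stores above */
    return NULL;
}

/* Reader: polls under the same lock, so it sees task_state once it sees the flag. */
static void *new_cpu(void *arg)
{
    int *seen = arg;

    for (;;) {
        pthread_mutex_lock(&handoff_lock); /* lock orders the loads below */
        if (handed_off) {
            *seen = task_state;
            pthread_mutex_unlock(&handoff_lock);
            return NULL;
        }
        pthread_mutex_unlock(&handoff_lock);
        sched_yield();                     /* let the writer run */
    }
}

/* Runs one handoff and returns the value the reader observed. */
int handoff_demo(void)
{
    pthread_t writer, reader;
    int seen = -1;
    int rc;

    pthread_mutex_lock(&handoff_lock);
    task_state = 0;
    handed_off = 0;
    pthread_mutex_unlock(&handoff_lock);

    rc = pthread_create(&reader, NULL, new_cpu, &seen);
    assert(rc == 0);
    rc = pthread_create(&writer, NULL, old_cpu, NULL);
    assert(rc == 0);
    (void)rc;

    pthread_join(writer, NULL);
    pthread_join(reader, NULL);
    return seen;
}
```

Built with -pthread, handoff_demo() returns 42: the reader only copies task_state after seeing handed_off under the same lock the writer held while setting both.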