On Fri, 2018-08-03 at 04:03:36 UTC, Reza Arbab wrote:
> We've encountered a performance issue when multiple processors stress
> {get,put}_mmio_atsd_reg(). These functions contend for mmio_atsd_usage,
> an unsigned long used as a bitmask.
>
> The accesses to mmio_atsd_usage are done using test_and_set_bit_lock()
> and clear_bit_unlock(). As implemented, both of these will require a
> (successful) stwcx to that same cache line.
>
> What we end up with is thread A, attempting to unlock, being slowed by
> other threads repeatedly attempting to lock. A's stwcx instructions fail
> and retry because the memory reservation is lost every time a different
> thread beats it to the punch.
>
> There may be a long-term way to fix this at a larger scale, but for now
> resolve the immediate problem by gating our call to
> test_and_set_bit_lock() with one to test_bit(), which is obviously
> implemented without using a store.
>
> Signed-off-by: Reza Arbab <ar...@linux.ibm.com>
> Acked-by: Alistair Popple <alist...@popple.id.au>

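For anyone reading along, the gating described in the quoted text is the
usual test-and-test-and-set pattern. Below is a minimal user-space sketch
of that idea, assuming C11 atomics as a stand-in for the kernel bitops.
It is not the code from the patch: usage_mask, bit_is_set(), get_slot()
and put_slot() are hypothetical names chosen for illustration only.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for mmio_atsd_usage: one bit per slot. */
static _Atomic unsigned long usage_mask;

/* Plain load, analogous to test_bit(): performs no store. */
static bool bit_is_set(unsigned int nr)
{
	return atomic_load_explicit(&usage_mask, memory_order_relaxed) &
	       (1UL << nr);
}

/* Atomic read-modify-write with acquire ordering, standing in for
 * test_and_set_bit_lock(). Returns the previous value of the bit. */
static bool test_and_set_bit_acquire(unsigned int nr)
{
	unsigned long bit = 1UL << nr;

	return atomic_fetch_or_explicit(&usage_mask, bit,
					memory_order_acquire) & bit;
}

/* Claim the first free slot; returns its index, or -1 if all are busy. */
static int get_slot(unsigned int nslots)
{
	for (unsigned int i = 0; i < nslots; i++) {
		/* Cheap read-only check first: skip slots that look busy
		 * so we do not issue an atomic store attempt on them. */
		if (bit_is_set(i))
			continue;

		/* The slot looked free; only now try to take it. */
		if (!test_and_set_bit_acquire(i))
			return (int)i;
	}
	return -1;
}

/* Release a slot with release ordering, standing in for
 * clear_bit_unlock(). */
static void put_slot(int slot)
{
	assert(slot >= 0);
	atomic_fetch_and_explicit(&usage_mask, ~(1UL << slot),
				  memory_order_release);
}
```

With two slots, successive get_slot(2) calls hand out 0 and then 1, a
third call returns -1, and put_slot(0) makes slot 0 available again. The
point of the read-only check is that threads which find a bit already set
do not attempt a store to the shared word, so they should interfere less
with a thread that is trying to clear its bit.
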
Applied to powerpc next, thanks.

https://git.kernel.org/powerpc/c/9eab9901b015f489199105c470de1f

cheers