On Mon, Mar 2, 2015 at 10:42 AM, Jason Low <jason.l...@hp.com> wrote:
>
> This patch converts the timers to 64 bit atomic variables and use
> atomic add to update them without a lock. With this patch, the percent
> of total time spent updating thread group cputimer timers was reduced
> from 30% down to less than 1%.

NAK.

Not because I think this is wrong, but because somebody needs to look at the effects on 32-bit architectures too.

In particular, check out lib/atomic64.c - which uses a hashed array of 16 spinlocks to do 64-bit atomics. That may or may not work ok in practice, but it does mean that now sample_group_cputimer() and update_gt_cputime() will take that (it ends up generally being the same) spinlock three times for the three atomic64_read()'s.

Now, I think on x86, we end up using not lib/atomic64.c but our own versions that use cmpxchg8b, which is probably fine from a performance standpoint. But I see a lot of "select GENERIC_ATOMIC64" for other architectures.

Anyway, it is *possible* that even on those 32-bit targets, the atomic64's aren't any worse than the current spinlock in practice. So the "NAK" is in no way absolute - but I'd just like to hear that this is all reasonably fine on 32-bit ARM and powerpc, for example.

Hmm?

               Linus
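
To make the 32-bit concern concrete, here is a rough user-space sketch of the hashed-spinlock scheme described above. It is not the code in lib/atomic64.c: every sketch_* name, the three field names, the 64-byte cache-line shift and the C11 spin loop are assumptions made for illustration. The acquisition counter exists only to show how many lock round trips one sample costs.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

#define NR_LOCKS 16   /* size of the hashed lock array */
static_assert((NR_LOCKS & (NR_LOCKS - 1)) == 0, "NR_LOCKS must be a power of two");

/* Hypothetical stand-in for a 64-bit atomic on a target without native support. */
typedef struct { int64_t counter; } sketch_atomic64_t;

static atomic_int sketch_locks[NR_LOCKS];       /* 0 = free, 1 = held */
static atomic_ulong sketch_lock_acquisitions;   /* demo-only instrumentation */

/* Hash the variable's address to one of the locks (64-byte lines assumed). */
static atomic_int *sketch_lock_addr(const sketch_atomic64_t *v)
{
    uintptr_t addr = (uintptr_t)v;

    addr >>= 6;
    addr ^= (addr >> 8) ^ (addr >> 16);
    return &sketch_locks[addr & (NR_LOCKS - 1)];
}

static void sketch_lock(atomic_int *l)
{
    /* Spin until we are the one who flips the flag from 0 to 1. */
    while (atomic_exchange_explicit(l, 1, memory_order_acquire))
        ;
    atomic_fetch_add(&sketch_lock_acquisitions, 1);
}

static void sketch_unlock(atomic_int *l)
{
    atomic_store_explicit(l, 0, memory_order_release);
}

static int64_t sketch_atomic64_read(const sketch_atomic64_t *v)
{
    atomic_int *l = sketch_lock_addr(v);

    sketch_lock(l);
    int64_t val = v->counter;
    sketch_unlock(l);
    return val;
}

static void sketch_atomic64_add(int64_t a, sketch_atomic64_t *v)
{
    atomic_int *l = sketch_lock_addr(v);

    sketch_lock(l);
    v->counter += a;
    sketch_unlock(l);
}

/* Hypothetical timer layout: three 64-bit counters sitting next to each other. */
struct sketch_cputimer { sketch_atomic64_t utime, stime, sum_exec_runtime; };
struct sketch_sample   { int64_t utime, stime, sum_exec_runtime; };

/* One sample is three reads, so three lock/unlock pairs. */
static void sketch_sample_cputimer(const struct sketch_cputimer *t,
                                   struct sketch_sample *out)
{
    out->utime            = sketch_atomic64_read(&t->utime);
    out->stime            = sketch_atomic64_read(&t->stime);
    out->sum_exec_runtime = sketch_atomic64_read(&t->sum_exec_runtime);
}
```

Because the three counters are adjacent, they will usually hash to the same lock, so a sample under this scheme takes and drops one spinlock three times where the existing code takes its lock once. Whether that is measurably worse is exactly what needs checking on the GENERIC_ATOMIC64 architectures.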