From: Rik van Riel <r...@redhat.com>

It looks like all the call paths that lead to __acct_update_integrals()
already have irqs disabled, and __acct_update_integrals() does not need
to disable irqs itself.
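For illustration, here is a rough sketch of the two wrappers in
kernel/tsacct.c that reach __acct_update_integrals() (paraphrased from
the v4.5-rc1 tree and not part of this patch; see the real file for the
authoritative versions):

	/*
	 * Per the analysis above, every path into these wrappers
	 * already runs with irqs disabled, e.g. the cputime accounting
	 * done from the timer tick, which is what makes the
	 * local_irq_save()/local_irq_restore() pair removed below
	 * redundant.
	 */
	void acct_account_cputime(struct task_struct *tsk)
	{
		__acct_update_integrals(tsk, tsk->utime, tsk->stime);
	}

	void acct_update_integrals(struct task_struct *tsk)
	{
		cputime_t utime, stime;

		task_cputime(tsk, &utime, &stime);
		__acct_update_integrals(tsk, utime, stime);
	}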
This is very convenient, since about half the CPU time left in this
function was spent in local_irq_save() alone.

Performance of a microbenchmark that calls an invalid syscall ten
million times in a row on a nohz_full CPU improves 21% vs. 4.5-rc1,
with both the removal of divisions from __acct_update_integrals() and
this patch applied; runtime drops from 3.7 to 2.9 seconds.

With these patches applied, the highest remaining CPU user in the
trace is native_sched_clock(), which is addressed in the next patch.

Suggested-by: Peter Zijlstra <pet...@infradead.org>
Signed-off-by: Rik van Riel <r...@redhat.com>
---
 kernel/tsacct.c | 6 +-----
 1 file changed, 1 insertion(+), 5 deletions(-)

diff --git a/kernel/tsacct.c b/kernel/tsacct.c
index 8908f8b1d26e..b2663d699a72 100644
--- a/kernel/tsacct.c
+++ b/kernel/tsacct.c
@@ -124,26 +124,22 @@ static void __acct_update_integrals(struct task_struct *tsk,
 				    cputime_t utime, cputime_t stime)
 {
 	cputime_t time, dtime;
-	unsigned long flags;
 	u64 delta;
 
 	if (unlikely(!tsk->mm))
 		return;
 
-	local_irq_save(flags);
 	time = stime + utime;
 	dtime = time - tsk->acct_timexpd;
 	delta = cputime_to_nsecs(dtime);
 
 	if (delta < TICK_NSEC)
-		goto out;
+		return;
 
 	tsk->acct_timexpd = time;
 	/* The final unit will be Mbyte-usecs, see xacct_add_tsk */
 	tsk->acct_rss_mem1 += delta * get_mm_rss(tsk->mm) / 1024;
 	tsk->acct_vm_mem1 += delta * tsk->mm->total_vm / 1024;
-out:
-	local_irq_restore(flags);
 }
 
 /**
-- 
2.5.0
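A hypothetical reconstruction of the microbenchmark described above
(illustrative only; the file name and the syscall number are
assumptions, since any out-of-range number that fails with -ENOSYS
exercises the same kernel entry/exit accounting path). Pin the task to
a nohz_full CPU before timing, e.g. "taskset -c 3 ./bench-enosys":

	/* bench-enosys.c: issue an invalid syscall ten million times
	 * in a row, as in the benchmark above. The syscall number
	 * 500000 is an assumption; it is merely an out-of-range value
	 * that returns -ENOSYS. */
	#include <unistd.h>
	#include <sys/syscall.h>

	int main(void)
	{
		long i;

		for (i = 0; i < 10000000; i++)
			syscall(500000); /* invalid: returns -ENOSYS */
		return 0;
	}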