(v5: address comments by Frederic & Peter, fix bug found by Eric)

Running with nohz_full introduces a fair amount of overhead. Specifically, various things that are usually done from the timer interrupt are now done at syscall, irq, and guest entry and exit times.
However, some of the code that is called every single time has only ever worked at jiffy resolution. The code in __acct_update_integrals was also doing some unnecessary calculations.

Getting rid of the unnecessary calculations, without changing any of the functionality in __acct_update_integrals, gets us about an 11% win.

Not calling the time statistics updating code more than once per jiffy, as is already the case on housekeeping CPUs and on all the CPUs of a non-nohz_full system, shaves off a further 30%.

I tested this series with a microbenchmark calling an invalid syscall number ten million times in a row, on a nohz_full CPU.

Run times for the microbenchmark:

4.4                             3.8 seconds
4.5-rc1                         3.7 seconds
4.5-rc1 + first patch           3.3 seconds
4.5-rc1 + first 3 patches       3.1 seconds
4.5-rc1 + all patches           2.3 seconds

Same test on a non-NOHZ_FULL, non-housekeeping CPU:

all kernels                     1.86 seconds
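
For reference, a microbenchmark of the kind described above could look roughly like the sketch below. This is not the program used for the numbers in this posting. The function name time_invalid_syscalls and the syscall number 100000 are assumptions made for illustration; any number the running kernel does not implement should behave the same way.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <time.h>
#include <unistd.h>

/* Assumption: no syscall with this number exists on the kernel under test. */
#define INVALID_SYSCALL_NR 100000L

/*
 * Issue `iterations` invalid syscalls back to back.
 *
 * Returns the elapsed wall-clock time in seconds and stores in *failed
 * the number of calls that returned -1 (normally with errno ENOSYS).
 * Each call still pays the full syscall entry and exit cost, which is
 * the path this series is concerned with.
 */
double time_invalid_syscalls(long iterations, long *failed)
{
	struct timespec start, end;
	long count = 0;

	assert(iterations >= 0);

	clock_gettime(CLOCK_MONOTONIC, &start);
	for (long i = 0; i < iterations; i++) {
		if (syscall(INVALID_SYSCALL_NR) == -1)
			count++;
	}
	clock_gettime(CLOCK_MONOTONIC, &end);

	*failed = count;
	return (double)(end.tv_sec - start.tv_sec) +
	       (double)(end.tv_nsec - start.tv_nsec) / 1e9;
}
```

Calling time_invalid_syscalls(10000000, &failed) from a small main() and pinning the binary to a nohz_full CPU (for example with taskset -c) would approximate the setup described above. Absolute times will depend on the hardware and kernel configuration.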