On Wed, Jun 22, 2016 at 10:25:48PM -0400, r...@redhat.com wrote:
> From: Rik van Riel <r...@redhat.com>
>
> The CONFIG_VIRT_CPU_ACCOUNTING_GEN irq time tracking code does not
> appear to currently work right.
>
> On CPUs that are nohz_full, people typically do not assign IRQs.
Right, but they can still fire. At least one tick per second, plus the
pinned timers, etc...

>
> On the housekeeping CPU (when a system is booted up with nohz_full),
> sampling should work ok to determine irq and softirq time use, but
> that only covers the housekeeping CPU itself, not the other
> non-nohz_full CPUs.

Hmm, every non-nohz_full CPU, including CPU 0, accounts the irqtime the
same way: through the tick (and therefore can't account much of it).
So I'm a bit confused by the above statements.

>
> On CPUs that are nohz_idle (the typical way a distro kernel is
> booted), irq time is not accounted at all while the CPU is idle,
> due to the lack of timer ticks.

But as soon as a timer tick fires in idle or afterward, the pending
irqtime is accounted.

That said, I don't see how it explains why we do the below:

>
> Remove the VTIME_GEN vtime irq time code. The next patch will
> allow NO_HZ_FULL kernels to use the IRQ_TIME_ACCOUNTING code.

I don't get the reason why we are doing this. Now arguably the irqtime
accounting is probably not working as well as before, since we switched
to the jiffy clock. But I still see some hard irqs accounted when
account_irq_exit() is lucky enough to observe that jiffies changed since
the beginning of the interrupt. So it's not entirely broken.

I agree that we need to switch it to the generic irqtime accounting code,
but breaking the code now to reactivate it in a subsequent patch is prone
to future bisection issues.

Thanks.
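
To illustrate the jiffy-granularity point above, here is a minimal
user-space sketch. It is not kernel code: TICK_NS (HZ=1000 assumed),
jiffies_at(), jiffy_accounted_ns() and jiffy_accounted_total_ns() are
hypothetical names made up for this model. It only shows that an
interrupt gets charged when the jiffy counter changes between its entry
and its exit, and is otherwise invisible.

```c
#include <stdint.h>

/* Assumption for this model: HZ=1000, so one jiffy lasts 1 ms. */
#define TICK_NS 1000000ULL

/* Hypothetical helper: value of the jiffy counter at absolute time t_ns. */
static uint64_t jiffies_at(uint64_t t_ns)
{
	return t_ns / TICK_NS;
}

/*
 * Time charged to one interrupt by a jiffy-granularity scheme:
 * the number of jiffy boundaries crossed between entry and exit,
 * converted back to nanoseconds. The real duration plays no role.
 */
static uint64_t jiffy_accounted_ns(uint64_t entry_ns, uint64_t exit_ns)
{
	return (jiffies_at(exit_ns) - jiffies_at(entry_ns)) * TICK_NS;
}

/*
 * Sum of the charged time for n periodic interrupts, each lasting
 * dur_ns, the first one starting at start_ns.
 */
static uint64_t jiffy_accounted_total_ns(uint64_t start_ns, uint64_t period_ns,
					 uint64_t dur_ns, unsigned int n)
{
	uint64_t total = 0;
	unsigned int i;

	for (i = 0; i < n; i++) {
		uint64_t entry = start_ns + (uint64_t)i * period_ns;

		total += jiffy_accounted_ns(entry, entry + dur_ns);
	}
	return total;
}
```

In this model, a 50us interrupt that starts and ends inside the same
jiffy is charged nothing, while one that happens to straddle a jiffy
boundary is charged a full jiffy. That is the "lucky enough" case
mentioned above.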