Hi,

This patchset implements the idle-hist feature, which analyzes the reasons
for system idle time.  Sometimes I need to investigate what makes CPUs go
idle even though there are jobs to do.  It may be due to I/O, waiting on a
lock, or whatever.
To identify the reasons, it only accounts events related to the idle task.
It also shows callchains taken when entering idle, and how long the CPU
stays idle.  Although it's not perfect, it works well to help find the
reasons.  With the --idle-hist option, only the idle task's runtime is
accounted, and it is attributed to the previous task (the one which made
the CPU go idle).  With the --summary(-only) option, you can see the total
idle stats by task or by callchain.  Example output looks like below:

Idle-time summary
                    comm  parent  sched-out  idle-time  min-idle  avg-idle  max-idle  stddev  migrations
                                    (count)     (msec)    (msec)    (msec)    (msec)       %
---------------------------------------------------------------------------------------------------------------------
          rcu_preempt[7]       2         95    550.872     0.011     5.798    23.146    7.63           0
         migration/1[16]       2          1     15.558    15.558    15.558    15.558    0.00           0
          khugepaged[39]       2          1      3.062     3.062     3.062     3.062    0.00           0
       kworker/0:1H[124]       2          2      4.728     0.611     2.364     4.116   74.12           0
    systemd-journal[167]       1          1      4.510     4.510     4.510     4.510    0.00           0
      kworker/u16:3[558]       2         13     74.737     0.080     5.749    12.960   21.96           0
     irq/34-iwlwifi[628]       2         21    118.403     0.032     5.638    23.990   24.00           0
      kworker/u17:0[673]       2          1      3.523     3.523     3.523     3.523    0.00           0
        dbus-daemon[722]       1          1      6.743     6.743     6.743     6.743    0.00           0
            ifplugd[741]       1          1     58.826    58.826    58.826    58.826    0.00           0
    wpa_supplicant[1490]       1          1     13.302    13.302    13.302    13.302    0.00           0
       wpa_actiond[1492]       1          2      4.064     0.168     2.032     3.896   91.72           0
           dockerd[1500]       1          1      0.055     0.055     0.055     0.055    0.00           0
      dockerd[1516/1500]    1500          1      5.732     5.732     5.732     5.732    0.00           0
      dockerd[1571/1500]    1500          1      0.020     0.020     0.020     0.020    0.00           0
  ...
Idle stats:
CPU  0 idle for    902.195  msec
CPU  1 idle for    899.932  msec
CPU  2 idle for    941.218  msec
CPU  3 idle for    822.453  msec

Idle stats by callchain:
CPU  0:   902.195 msec
  Idle time (msec)    Count  Callchains
  ----------------  -------  --------------------------------------------------
           370.589       69  futex_wait_queue_me <- futex_wait <- do_futex <- sys_futex <- entry_SYSCALL_64_fastpath
           178.799       17  worker_thread <- kthread <- ret_from_fork
           128.352       17  schedule_timeout <- rcu_gp_kthread <- kthread <- ret_from_fork
           125.111       19  schedule_hrtimeout_range_clock <- schedule_hrtimeout_range <- poll_schedule_timeout <- do_select <- core_sys_select
            71.599       50  schedule_hrtimeout_range_clock <- schedule_hrtimeout_range <- poll_schedule_timeout <- do_sys_poll <- sys_poll
            23.146        1  rcu_gp_kthread <- kthread <- ret_from_fork
             4.510        1  schedule_hrtimeout_range_clock <- schedule_hrtimeout_range <- ep_poll <- sys_epoll_wait <- do_syscall_64
             0.085        1  schedule_hrtimeout_range_clock <- schedule_hrtimeout_range <- poll_schedule_timeout <- do_sys_poll <- do_restart_poll
  ...

Patches 1-5 are cleanups and preparations, so they can be applied
separately.  Patches 6-10 implement the -I/--idle-hist option.

The code is available at the 'perf/timehist-idle-v1' branch in my tree:

  git://git.kernel.org/pub/scm/linux/kernel/git/namhyung/linux-perf.git

Any comments are welcome!
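For anyone who wants to try the series, the output above comes from the
usual timehist workflow; a rough sketch follows (the exact record options
are my assumption, and perf must be built from the branch above for -I to
exist):

```shell
# Record sched events with callchains so the idle callchains can be
# shown later (the workload and options here are just an example).
perf sched record -g -- sleep 1

# Per-event view: with -I/--idle-hist (added by this series), idle
# runtime is accounted to the task that made the CPU go idle.
perf sched timehist -I

# Summary view: per-task idle stats and per-CPU idle callchains.
perf sched timehist -I --summary
```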
Thanks,
Namhyung


Namhyung Kim (10):
  perf sched: Cleanup option processing
  perf tools: Introduce callchain_cursor__copy()
  perf sched timehist: Handle zero sample->tid properly
  perf sched timehist: Split is_idle_sample()
  perf sched timehist: Cleanup idle_max_cpu handling
  perf sched timehist: Introduce struct idle_time_data
  perf sched timehist: Save callchain when entering idle
  perf sched timehist: Skip non-idle events when necessary
  perf sched timehist: Add -I/--idle-hist option
  perf sched timehist: Show callchains for idle stat

 tools/perf/Documentation/perf-sched.txt |   4 +
 tools/perf/builtin-sched.c              | 264 +++++++++++++++++++++++++++-----
 tools/perf/util/callchain.c             |  27 ++++
 tools/perf/util/callchain.h             |   3 +
 4 files changed, 256 insertions(+), 42 deletions(-)

--
2.10.1