On 04/08/21 22:53, Dongli Zhang wrote:
> During bootup or cpu hotplug, cpuhp_up_callbacks() or
> cpuhp_down_callbacks() calls many CPUHP callbacks (e.g., perf, mm,
> workqueue, RCU, kvmclock and more) for each cpu to online/offline. It may
> roll back to its previous state if any of the callbacks fails. As a result,
> the user will not be able to know which callback failed, and usually the
> only symptom is a cpu online/offline failure.
>
> This patch prints more debug logs to help the user narrow down the
> root cause.
>
> Below is an example of how the patch helps narrow down the root cause
> for the issue fixed by commit d7eb79c6290c ("KVM: kvmclock: Fix vCPUs > 64
> can't be online/hotpluged").
>
> We will have the dynamic debug log below once we add
> dyndbg="file kernel/cpu.c +p" to the kernel command line and the issue is
> reproduced.

You can also enable it at runtime:

	echo "file kernel/cpu.c +p" > /sys/kernel/debug/dynamic_debug/control

>
> "CPUHP up callback failure (-12) for cpu 64 at kvmclock:setup_percpu (66)"
>
> Cc: Joe Jin <joe....@oracle.com>
> Signed-off-by: Dongli Zhang <dongli.zh...@oracle.com>
> ---

I don't see the harm in adding the debug if some find it useful. FWIW

Reviewed-by: Qais Yousef <qais.you...@arm.com>

Cheers

--
Qais Yousef

> Changed since v1 RFC:
>   - use pr_debug() but not pr_err_once() (suggested by Qais Yousef)
>   - print log for cpuhp_down_callbacks() as well (suggested by Qais Yousef)
>
>  kernel/cpu.c | 8 ++++++++
>  1 file changed, 8 insertions(+)
>
> diff --git a/kernel/cpu.c b/kernel/cpu.c
> index 1b6302ecbabe..bcd4dd7de9c3 100644
> --- a/kernel/cpu.c
> +++ b/kernel/cpu.c
> @@ -621,6 +621,10 @@ static int cpuhp_up_callbacks(unsigned int cpu, struct cpuhp_cpu_state *st,
>  		st->state++;
>  		ret = cpuhp_invoke_callback(cpu, st->state, true, NULL, NULL);
>  		if (ret) {
> +			pr_debug("CPUHP up callback failure (%d) for cpu %u at %s (%d)\n",
> +				 ret, cpu, cpuhp_get_step(st->state)->name,
> +				 st->state);
> +
>  			if (can_rollback_cpu(st)) {
>  				st->target = prev_state;
>  				undo_cpu_up(cpu, st);
> @@ -990,6 +994,10 @@ static int cpuhp_down_callbacks(unsigned int cpu, struct cpuhp_cpu_state *st,
>  	for (; st->state > target; st->state--) {
>  		ret = cpuhp_invoke_callback(cpu, st->state, false, NULL, NULL);
>  		if (ret) {
> +			pr_debug("CPUHP down callback failure (%d) for cpu %u at %s (%d)\n",
> +				 ret, cpu, cpuhp_get_step(st->state)->name,
> +				 st->state);
> +
>  			st->target = prev_state;
>  			if (st->state < prev_state)
>  				undo_cpu_down(cpu, st);
> --
> 2.17.1
>
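
For anyone reading along without the kernel tree at hand, the control flow the patch instruments can be sketched as a small user-space model. This is only an illustration under stated assumptions, not the kernel implementation: struct step, the steps[] table, the "example:*" step names, clock_up(), and bring_up() are all hypothetical, and fprintf() stands in for pr_debug(). It shows the up path only: walk the states in order, report the failing step by name and index, then undo the steps that had already succeeded.

```c
#include <errno.h>
#include <stdio.h>

/* Hypothetical user-space model of one hotplug step (not struct cpuhp_step). */
struct step {
	const char *name;
	int (*up)(unsigned int cpu);
	void (*down)(unsigned int cpu);
};

/* Counts teardown calls so the rollback can be observed. */
static int undo_calls;

static int ok_up(unsigned int cpu) { (void)cpu; return 0; }
static void count_down(unsigned int cpu) { (void)cpu; undo_calls++; }

/* Assumed failure mode for illustration: setup fails for cpu >= 64. */
static int clock_up(unsigned int cpu) { return cpu >= 64 ? -ENOMEM : 0; }

/* Index 0 is the "offline" state and is never invoked. */
static const struct step steps[] = {
	{ "offline",             NULL,     NULL       },
	{ "example:prepare",     ok_up,    count_down },
	{ "example:clock_setup", clock_up, count_down },
	{ "example:online",      ok_up,    count_down },
};

/*
 * Run the up callbacks from *state + 1 through target. On failure, log
 * which step failed, undo the steps that succeeded in this call, and
 * leave *state at its previous value.
 */
int bring_up(unsigned int cpu, int *state, int target)
{
	int prev_state = *state;
	int ret = 0;

	while (*state < target) {
		(*state)++;
		ret = steps[*state].up(cpu);
		if (ret) {
			/* Without this line the caller only sees the error code. */
			fprintf(stderr,
				"up callback failure (%d) for cpu %u at %s (%d)\n",
				ret, cpu, steps[*state].name, *state);

			/* The failed step did not complete, so start one below it. */
			for ((*state)--; *state > prev_state; (*state)--)
				steps[*state].down(cpu);
			break;
		}
	}
	return ret;
}
```

Calling bring_up(64, &state, 3) with state starting at 0 returns -ENOMEM, leaves state at 0 after undoing the one step that had succeeded, and names "example:clock_setup" in the message. That name is the piece of information the caller could not otherwise recover once the rollback has run.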