Hello,

On Tue, Sep 16, 2025 at 10:13:12PM -0700, Ian Rogers wrote:
> On Tue, Sep 16, 2025 at 6:47 PM Jinchao Wang <wangjinchao...@gmail.com> wrote:
> >
> > On Tue, Sep 16, 2025 at 05:03:48PM -0700, Ian Rogers wrote:
> > > On Tue, Sep 16, 2025 at 7:51 AM Jinchao Wang <wangjinchao...@gmail.com> wrote:
> > > >
> > > > Currently, the hard lockup detector is selected at compile time via
> > > > Kconfig, which requires a kernel rebuild to switch implementations.
> > > > This is inflexible, especially on systems where a perf event may not
> > > > be available or may be needed for other tasks.
> > > >
> > > > This commit refactors the hard lockup detector to replace a rigid
> > > > compile-time choice with a flexible build-time and boot-time solution.
> > > > The patch supports building the kernel with either detector
> > > > independently, or with both. When both are built, a new boot parameter
> > > > `hardlockup_detector="perf|buddy"` allows the selection at boot time.
> > > > This is a more robust and user-friendly design.
> > > >
> > > > This patch is a follow-up to the discussion on the kernel mailing list
> > > > regarding the preference and future of the hard lockup detectors. It
> > > > implements a flexible solution that addresses the community's need to
> > > > select an appropriate detector at boot time.
> > > >
> > > > The core changes are:
> > > > - The `perf` and `buddy` watchdog implementations are separated into
> > > >   distinct functions (e.g., `watchdog_perf_hardlockup_enable`).
> > > > - Global function pointers are introduced
> > > >   (`watchdog_hardlockup_enable_ptr`) to serve as a single API for the
> > > >   entire feature.
> > > > - A new `hardlockup_detector=` boot parameter is added to allow the
> > > >   user to select the desired detector at boot time.
> > > > - The Kconfig options are simplified by removing the complex
> > > >   `HARDLOCKUP_DETECTOR_PREFER_BUDDY` and allowing both detectors to be
> > > >   built without mutual exclusion.
> > > > - The weak stubs are updated to call the new function pointers,
> > > >   centralizing the watchdog logic.
> > >
> > > What is the impact on /proc/sys/kernel/nmi_watchdog ? Is that
> > > enabling and disabling whatever the boot time choice was? I'm not sure
> > > why this has to be a boot time option given the ability to configure
> > > via /proc/sys/kernel/nmi_watchdog.
> > The new hardlockup_detector boot parameter and the existing
> > /proc/sys/kernel/nmi_watchdog file serve different purposes.
> >
> > The boot parameter selects the type of hard lockup detector (perf or buddy).
> > This choice is made once at boot.
> >
> > /proc/sys/kernel/nmi_watchdog, on the other hand, is only a simple on/off
> > switch for the currently selected detector. It does not change the
> > detector's type.
> 
> So the name "nmi_watchdog" for the buddy watchdog is wrong for fairly
> obvious naming reasons but also because we can't differentiate when a
> perf event has been taken or not - this impacts perf that is choosing
> not to group events in metrics because of it, reducing the metric's
> accuracy. We need an equivalent "buddy_watchdog" file to the
> "nmi_watchdog" file. If we have such a file then if I did "echo 1 >
> /proc/sys/kernel/nmi_watchdog" I'd expect the buddy watchdog to be
> disabled and the perf event one to be enabled. Similarly, if I did
> "echo 1 > /proc/sys/kernel/buddy_watchdog" then I would expect the
> perf event watchdog to be disabled and the buddy one enabled. If I did
>  "echo 0 > /proc/sys/kernel/nmi_watchdog; echo 0 >
> /proc/sys/kernel/buddy_watchdog" then I'd expect neither to be
> enabled. I don't see why choosing the type of watchdog implementation
> at boot time is particularly desirable. It seems sensible to default
> normal people to using the buddy watchdog (more perf events, power...)
> and CONFIG_DEBUG_KERNEL type people to using the perf event one. As
> the "nmi_watchdog" file may be assumed to control the buddy watchdog,
> perhaps a compatibility option (where the "nmi_watchdog" file controls
> the buddy watchdog) is needed so that user code has time to migrate.

Sounds good to me.  For perf tools, it'd be great if we could have a
run-time check for which watchdog is selected.
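
A rough sketch of what such a run-time check could look like from user
space, under the two-file scheme suggested above. This is only an
illustration: the "buddy_watchdog" file is the proposal from this thread
and does not exist today, and the names read_sysctl_flag() and
classify_watchdog() are made up for the example. When the proposed file
is absent, an enabled "nmi_watchdog" is reported as ambiguous, because
that file alone does not say which detector is in use.

```c
#include <stdio.h>

enum watchdog_kind {
	WATCHDOG_NONE,		/* no hard lockup detector enabled */
	WATCHDOG_PERF,		/* perf-event (NMI) based detector */
	WATCHDOG_BUDDY,		/* buddy detector */
	WATCHDOG_UNKNOWN,	/* cannot tell from the files available */
};

/* Read one integer from a sysctl-style file; return -1 if unreadable. */
static int read_sysctl_flag(const char *path)
{
	FILE *f = fopen(path, "r");
	int value = -1;

	if (!f)
		return -1;
	if (fscanf(f, "%d", &value) != 1)
		value = -1;
	fclose(f);
	return value;
}

/*
 * Map the two flags to a detector kind. A negative flag means the
 * corresponding file could not be read. buddy_flag comes from the
 * hypothetical /proc/sys/kernel/buddy_watchdog file.
 */
static enum watchdog_kind classify_watchdog(int nmi_flag, int buddy_flag)
{
	if (buddy_flag > 0)
		return WATCHDOG_BUDDY;
	if (nmi_flag == 0)
		return WATCHDOG_NONE;
	if (nmi_flag > 0 && buddy_flag == 0)
		return WATCHDOG_PERF;
	/* nmi_watchdog unreadable, or enabled without the buddy file. */
	return WATCHDOG_UNKNOWN;
}
```

A tool would call classify_watchdog() with the results of
read_sysctl_flag("/proc/sys/kernel/nmi_watchdog") and
read_sysctl_flag("/proc/sys/kernel/buddy_watchdog"), and only avoid
grouping events in metrics when the result is WATCHDOG_PERF or
WATCHDOG_UNKNOWN.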

Thanks,
Namhyung

