Hello, On Tue, Sep 16, 2025 at 10:13:12PM -0700, Ian Rogers wrote: > On Tue, Sep 16, 2025 at 6:47 PM Jinchao Wang <wangjinchao...@gmail.com> wrote: > > > > On Tue, Sep 16, 2025 at 05:03:48PM -0700, Ian Rogers wrote: > > > On Tue, Sep 16, 2025 at 7:51 AM Jinchao Wang <wangjinchao...@gmail.com> > > > wrote: > > > > > > > > Currently, the hard lockup detector is selected at compile time via > > > > Kconfig, which requires a kernel rebuild to switch implementations. > > > > This is inflexible, especially on systems where a perf event may not > > > > be available or may be needed for other tasks. > > > > > > > > This commit refactors the hard lockup detector to replace a rigid > > > > compile-time choice with a flexible build-time and boot-time solution. > > > > The patch supports building the kernel with either detector > > > > independently, or with both. When both are built, a new boot parameter > > > > `hardlockup_detector="perf|buddy"` allows the selection at boot time. > > > > This is a more robust and user-friendly design. > > > > > > > > This patch is a follow-up to the discussion on the kernel mailing list > > > > regarding the preference and future of the hard lockup detectors. It > > > > implements a flexible solution that addresses the community's need to > > > > select an appropriate detector at boot time. > > > > > > > > The core changes are: > > > > - The `perf` and `buddy` watchdog implementations are separated into > > > > distinct functions (e.g., `watchdog_perf_hardlockup_enable`). > > > > - Global function pointers are introduced > > > > (`watchdog_hardlockup_enable_ptr`) > > > > to serve as a single API for the entire feature. > > > > - A new `hardlockup_detector=` boot parameter is added to allow the > > > > user to select the desired detector at boot time. > > > > - The Kconfig options are simplified by removing the complex > > > > `HARDLOCKUP_DETECTOR_PREFER_BUDDY` and allowing both detectors to be > > > > built without mutual exclusion. > > > > - The weak stubs are updated to call the new function pointers, > > > > centralizing the watchdog logic. > > > > > > What is the impact on /proc/sys/kernel/nmi_watchdog ? Is that > > > enabling and disabling whatever the boot time choice was? I'm not sure > > > why this has to be a boot time option given the ability to configure > > > via /proc/sys/kernel/nmi_watchdog. > > The new hardlockup_detector boot parameter and the existing > > /proc/sys/kernel/nmi_watchdog file serve different purposes. > > > > The boot parameter selects the type of hard lockup detector (perf or buddy). > > This choice is made once at boot. > > > > /proc/sys/kernel/nmi_watchdog, on the other hand, is only a simple on/off > > switch for the currently selected detector. It does not change the > > detector's > > type. > > So the name "nmi_watchdog" for the buddy watchdog is wrong for fairly > obvious naming reasons but also because we can't differentiate when a > perf event has been taken or not - this impacts perf that is choosing > not to group events in metrics because of it, reducing the metric's > accuracy. We need an equivalent "buddy_watchdog" file to the > "nmi_watchdog" file. If we have such a file then if I did "echo 1 > > /proc/sys/kernel/nmi_watchdog" I'd expect the buddy watchdog to be > disabled and the perf event one to be enabled. Similarly, if I did > "echo 1 > /proc/sys/kernel/buddy_watchdog" then I would expect the > perf event watchdog to be disabled and the buddy one enabled. If I did > "echo 0 > /proc/sys/kernel/nmi_watchdog; echo 0 > > /proc/sys/kernel/buddy_watchdog" then I'd expect neither to be > enabled. I don't see why choosing the type of watchdog implementation > at boot time is particularly desirable. It seems sensible to default > normal people to using the buddy watchdog (more perf events, power...) > and CONFIG_DEBUG_KERNEL type people to using the perf event one. As > the "nmi_watchdog" file may be assumed to control the buddy watchdog, > perhaps a compatibility option (where the "nmi_watchdog" file controls > the buddy watchdog) is needed so that user code has time to migrate.
Sounds good to me. For perf tools, it'd be great if we can have a run- time check which watchdog is selected. Thanks, Namhyung