false positives' as the repeated cancel/restart
of watchdog_timer_fn() prevents the 'watchdog/N' thread from running
(i.e. I think the thread is not prevented from running by something
actually hogging CPU N).
Regards,
Uli
- Original Message -
From: "Josh Hunt"
's patch and it
did not help. After that I did a git bisect to figure out when the soft
lockup was fixed and it appears to be resolved after one of the commits
in this series:
commit 81a4beef91ba4a9e8ad6054ca9933dff7e25ff28
Author: Ulrich Obergfell
Date: Fri Sep 4 15:45:15 2015 -0700
watchdog: introduce watchd
Tejun,
> Sure, separating the knobs out isn't difficult. I still don't like
> the idea of having multiple set of similar knobs controlling about the
> same thing tho.
>
> For example, let's say there's a user who boots with "nosoftlockup"
> explicitly. I'm pretty sure the user wouldn't be inten
mind is: Would the workqueue watchdog
participate in the lockup detector suspend/resume mechanism, and if yes, how
would it be integrated into this ?
Regards,
Uli
- Original Message -
From: "Tejun Heo"
To: "Don Zickus"
Cc: "Ulrich Obergfell" , "Ing
suspended, and the thread could thus
interfere unexpectedly with the code that requested to suspend the
lockup detector. Avoid the race by calling
get_online_cpus() in lockup_detector_suspend()
put_online_cpus() in lockup_detector_resume()
Signed-off-by: Ulrich Obergfell
---
kernel/watchdog.c
This patch set addresses various races in relation to CPU hotplug
and a race in relation to watchdog timer expiry. I discovered the
corner cases during code inspection. I haven't seen any of these
issues occur in practice.
Ulrich Obergfell (4):
watchdog: avoid race between lockup det
ix this by checking the current value of 'watchdog_thresh'.
Signed-off-by: Ulrich Obergfell
---
kernel/watchdog.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/kernel/watchdog.c b/kernel/watchdog.c
index 84c4744..18f34cf 100644
--- a/kernel/watchdog.c
+++ b/kernel/watchdo
get|put}_online_cpus() in proc_watchdog_common()
{get|put}_online_cpus() in proc_watchdog_thresh()
{get|put}_online_cpus() in proc_watchdog_cpumask()
Signed-off-by: Ulrich Obergfell
---
kernel/watchdog.c | 6 ++
1 file changed, 6 insertions(+)
diff --git a/kernel/watchdog.c b/ker
watchdog_{park|unpark}_threads() are now called in code paths that
protect themselves against CPU hotplug, so {get|put}_online_cpus()
calls are redundant and can be removed.
Signed-off-by: Ulrich Obergfell
---
kernel/watchdog.c | 10 ++
1 file changed, 6 insertions(+), 4 deletions
ke error returns from kthread_park()
in order to test the patches.
Regards,
Uli
- Original Message -
From: "Andrew Morton"
To: "Ulrich Obergfell"
Cc: linux-kernel@vger.kernel.org, dzic...@redhat.com, atom...@redhat.com
Sent: Wednesday, September 30, 2015 1:30:36 AM
Sub
It makes sense to place watchdog_{dis|enable}_all_cpus() outside of
the ifdef so that _both_ are available even if CONFIG_SYSCTL is not
defined.
Signed-off-by: Ulrich Obergfell
---
kernel/watchdog.c | 8 +---
1 file changed, 5 insertions(+), 3 deletions(-)
diff --git a/kernel/watchdog.c b
update_watchdog_all_cpus() now passes errors from watchdog_park_threads()
up to functions in the call chain. This allows watchdog_enable_all_cpus()
and proc_watchdog_update() to handle such errors too.
Signed-off-by: Ulrich Obergfell
---
kernel/watchdog.c | 30 +++---
1
Restore the previous value of watchdog_thresh _and_ sample_period
if proc_watchdog_update() returns an error. The variables must be
consistent to avoid false positives of the lockup detectors.
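
The rollback described above can be sketched in userspace C. This is only a model, not the kernel code: `proc_watchdog_update_model()` and `fail_next_update` are hypothetical stand-ins, and the relation between `watchdog_thresh` and `sample_period` used here is an assumption for illustration.

```c
#include <errno.h>

/* Userspace model only: these globals stand in for the kernel variables
 * named in the commit message. */
static int watchdog_thresh = 10;
static unsigned long long sample_period;
static int fail_next_update;	/* hypothetical hook that forces one failure */

/* Assumed relation for illustration: the soft lockup threshold is
 * 2 * watchdog_thresh seconds, sampled five times, in nanoseconds. */
static void set_sample_period(void)
{
	sample_period = (unsigned long long)(2 * watchdog_thresh) *
			(1000000000ULL / 5);
}

/* Hypothetical stand-in for proc_watchdog_update(). */
static int proc_watchdog_update_model(void)
{
	if (fail_next_update) {
		fail_next_update = 0;
		return -EIO;
	}
	return 0;
}

/* Apply a new threshold; on error restore BOTH variables so that they
 * stay consistent with each other. */
int set_watchdog_thresh(int new_thresh)
{
	int old_thresh = watchdog_thresh;
	unsigned long long old_period = sample_period;
	int err;

	watchdog_thresh = new_thresh;
	set_sample_period();

	err = proc_watchdog_update_model();
	if (err) {
		watchdog_thresh = old_thresh;
		sample_period = old_period;
	}
	return err;
}
```

Restoring only `watchdog_thresh` would leave a sample period computed from the rejected value, which is the inconsistency the commit message warns about.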
Signed-off-by: Ulrich Obergfell
---
kernel/watchdog.c | 7 ---
1 file changed, 4 insertions(+), 3
the lockup detectors will soon be disabled by the callers anyway.
Signed-off-by: Ulrich Obergfell
---
kernel/watchdog.c | 10 ++
1 file changed, 6 insertions(+), 4 deletions(-)
diff --git a/kernel/watchdog.c b/kernel/watchdog.c
index 3bc22a9..af70bf2 100644
--- a/kernel/watchdog.c
++
lockup_detector_suspend() now handles errors from watchdog_park_threads().
Signed-off-by: Ulrich Obergfell
---
kernel/watchdog.c | 5 +
1 file changed, 5 insertions(+)
diff --git a/kernel/watchdog.c b/kernel/watchdog.c
index 457113c..3bc22a9 100644
--- a/kernel/watchdog.c
+++ b/kernel
s. Failure becomes visible to the
user as follows:
- error messages from lockup_detector_suspend()
or watchdog_enable_all_cpus()
- the state that can be read from /proc/sys/kernel/watchdog_enabled
- the 'write' system call in the latter call chain returns an error
Signed-off-by: Ulrich Obergfell
---
arch/x86/kernel/cpu/perf_event_intel.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/x86/kernel/cpu/perf_event_intel.c
b/arch/x86/kernel/cpu/perf_event_intel.c
index 0357bf7..abb25c3 100644
--- a/arch/x86/kernel/cpu
l=linux-kernel&m=143869949229461&w=2
Ulrich Obergfell (2):
watchdog: rename watchdog_suspend() and watchdog_resume()
watchdog: use pr_debug() in fixup_ht_bug() failure path
arch/x86/kernel/cpu/perf_event_intel.c | 6 +++---
include/linux/nmi.h | 8 +++
watchdog_suspended variables and their relationship.
Signed-off-by: Ulrich Obergfell
---
arch/x86/kernel/cpu/perf_event_intel.c | 4 ++--
include/linux/nmi.h | 8
kernel/watchdog.c | 26 ++
3 files changed, 28 insertions(+), 10
> - Original Message -
> From: "Andrew Morton"
> ...
> On Sat, 1 Aug 2015 14:49:23 +0200 Ulrich Obergfell
> wrote:
>
>> This interface can be utilized to deactivate the hard and soft lockup
>> detector temporarily. Callers are expected to
> - Original Message -
> From: "Michal Hocko"
...
> On Sat 01-08-15 14:49:22, Ulrich Obergfell wrote:
>> These functions are intended to be used only from inside kernel/watchdog.c
>> to park/unpark all watchdog threads that are specified in watchdog_cpuma
> - Original Message -
> From: "Don Zickus"
...
> On Tue, Aug 04, 2015 at 03:31:30PM +0200, Michal Hocko wrote:
>> On Sat 01-08-15 14:49:25, Ulrich Obergfell wrote:
>> [...]
>> > @@ -3368,7 +3368,10 @@ static __init int fix
Peter,
>> I posted the patch set here:
>>
>> https://lkml.org/lkml/2015/8/1/64
>> https://lkml.org/lkml/2015/8/1/65
>> https://lkml.org/lkml/2015/8/1/66
>> https://lkml.org/lkml/2015/8/1/67
>> https://lkml.org/lkml/2015/8/1/68
>
> If only you didn't use lkml.org links, that site is fla
> - Original Message -
> From: "Guenter Roeck"
> ...
> Subject: Re: [PATCH 2/4] watchdog: introduce watchdog_suspend() and
> watchdog_resume()
>
> On Sat, Aug 01, 2015 at 02:49:23PM +0200, Ulrich Obergfell wrote:
>> This interface can be utilized t
Don,
> Uli privately has been working on a patchset that cleans up a bunch of these
> race conditions. We believe it should cover this case. It uses the
> proc_mutex to synchronize everything.
>
> I think he is reaching out to you. If you could try his patchset to see if
> it fixes things, it
This interface can be utilized to deactivate the hard and soft lockup
detector temporarily. Callers are expected to minimize the duration of
deactivation. Multiple deactivations are allowed to occur in parallel
but should be rare in practice.
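
One way to allow multiple deactivations in parallel is a counter of outstanding suspend requests. The following is a minimal single-threaded sketch under that assumption; the `_model` function names and the `threads_parked` flag are hypothetical, and the serialization that real code would need (for example a mutex around both functions) is deliberately omitted.

```c
/* Userspace model only; not the kernel implementation. */
static int watchdog_suspended;	/* number of outstanding suspend requests */
static int threads_parked;	/* 1 while the modelled threads are parked */

/* The first caller parks the threads; later callers only bump the count. */
int watchdog_suspend_model(void)
{
	if (watchdog_suspended == 0)
		threads_parked = 1;
	watchdog_suspended++;
	return 0;
}

/* The last caller to resume unparks the threads. */
void watchdog_resume_model(void)
{
	if (watchdog_suspended == 0)
		return;		/* unbalanced resume: nothing to do */
	if (--watchdog_suspended == 0)
		threads_parked = 0;
}
```

With this scheme the detectors stay deactivated until every caller that suspended them has resumed, which is why callers are asked to keep the deactivation short.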
Signed-off-by: Ulrich Obergfell
---
include/linux
Remove update_watchdog() and restart_watchdog_hrtimer() since these
functions are no longer needed. Changes of parameters such as the
sample period are honored at the time when the watchdog threads are
being unparked.
Signed-off-by: Ulrich Obergfell
---
kernel/watchdog.c | 40
Remove watchdog_nmi_disable_all() and watchdog_nmi_enable_all()
since these functions are no longer needed. If a subsystem has a
need to deactivate the watchdog temporarily, it should utilize the
watchdog_suspend() and watchdog_resume() functions.
Signed-off-by: Ulrich Obergfell
---
arch/x86
These functions are intended to be used only from inside kernel/watchdog.c
to park/unpark all watchdog threads that are specified in watchdog_cpumask.
Signed-off-by: Ulrich Obergfell
---
kernel/watchdog.c | 36
1 file changed, 36 insertions(+)
diff --git a
pected that the duration of deactivation will be short.
Ulrich Obergfell (4):
watchdog: introduce watchdog_park_threads() and
watchdog_unpark_threads()
watchdog: introduce watchdog_suspend() and watchdog_resume()
watchdog: use park/unpark functions in update_watchdog_all_cpus()
w
Frederic,
since you changed the function name, you may want to adjust the comment header
too.
v
/**
* smpboot_register_percpu_thread - Register a per_cpu thread related to
hotplug
* @plug_thread:Hotplug thread descriptor
+ * @cpumask:
Peter,
please see my comments in-line.
Regards,
Uli
- Original Message -
From: "Peter Zijlstra"
To: "Michal Hocko"
[...]
> On Mon, May 18, 2015 at 11:03:37AM +0200, Michal Hocko wrote:
>> This doesn't hang anymore. I've just had to move the mutex definition
>> up to make it compile.
reproduce.
>> > >
>> > > So've tried to bisect ^80dcc31fbe55 e4b0db72be24 and merged 80dcc31fbe55
>> > > in each step.
>> >
>> > Good extra work! Thanks.
>> >
>> > > This lead to:
>> > >
>> > >
- Original Message -
From: "Chris Metcalf"
[...]
On 04/21/2015 08:32 AM, Ulrich Obergfell wrote:
>> Chris,
>>
>> in v9, smpboot_update_cpumask_percpu_thread() allocates 'tmp' mask
>> dynamically.
>> This allocation can fail and thus
- Original Message -
From: "Chris Metcalf"
[...]
> On 04/22/2015 04:20 AM, Ulrich Obergfell wrote:
>> Chris,
>>
>> in principle the change looks o.k. to me, even though I'm not really familiar
>> with the watchdog_nmi_disable_all() and watchd
- Original Message -
From: "Don Zickus"
[...]
> On Tue, Apr 21, 2015 at 10:07:00AM -0400, Ulrich Obergfell wrote:
>>
>> Chris,
>>
[...]
>> I think the user should only be allowed to specify a mask that is a subset of
>> tick_nohz_full_mask as o
> - Original Message -
> From: "Andrew Morton"
> To: "Don Zickus"
> Cc: "LKML" , "Ulrich Obergfell"
>
> Sent: Wednesday, April 22, 2015 10:12:01 PM
> Subject: Re: [PATCH] watchdog: Fix watchdog_nmi_enable_all()
>
> On W
Chris,
in https://lkml.org/lkml/2015/4/17/616 you stated:
">> + alloc_cpumask_var(&watchdog_cpumask_for_smpboot, GFP_KERNEL);
>
> alloc_cpumask_var could fail?
Good catch; if I get a failure I'll just return early without trying to
start the watchdog, since clearly things are too memo
or_init() is executed before fixup_ht_bug().
Regards,
Uli
On 04/16/2015 06:46 AM, Ulrich Obergfell wrote:
> if a user changes watchdog parameters in /proc/sys/kernel, the watchdog
> threads
> are not stopped and restarted in all cases. Parameters can also be changed 'on
> the fl
Chris,
I think it would also be nice to check the plausibility of the user input.
+int proc_watchdog_cpumask(struct ctl_table *table, int write,
+ void __user *buffer, size_t *lenp, loff_t *ppos)
+{
+int err;
+
+mutex_lock(&watchdog_proc_mutex);
+
ker" , "Don Zickus"
, "Ingo Molnar" , "Andrew Morton"
, "Andrew Jones" , "chai wen"
, "Ulrich Obergfell" , "Fabian
Frederick" , "Aaron Tomlin" , "Ben Zhang"
, "Christoph Lameter" , "Gilad Ben-Yos
to determine
whether the NMI watchdog is running, not the content of watchdog_user_enabled.
The attached [RFC PATCH 1/1] has undergone minimal testing only.
Ulrich Obergfell (1):
watchdog: fix watchdog_nmi_enable_all()
kernel/watchdog.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion
ABLED bit
in 'watchdog_enabled'.
Signed-off-by: Ulrich Obergfell
---
kernel/watchdog.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/kernel/watchdog.c b/kernel/watchdog.c
index 2316f50..cba2110 100644
--- a/kernel/watchdog.c
+++ b/kernel/watchdog.c
@@ -608,7 +60
Chris,
if a user changes watchdog parameters in /proc/sys/kernel, the watchdog threads
are not stopped and restarted in all cases. Parameters can also be changed 'on
the fly', for example like 'watchdog_thresh' in the following flow of execution:
proc_watchdog_thresh
proc_watchdog_update
hwell"
To: "Andrew Morton" , "Chris Metcalf"
Cc: linux-n...@vger.kernel.org, linux-kernel@vger.kernel.org, "Ulrich
Obergfell" , "Don Zickus"
Sent: Tuesday, April 7, 2015 1:21:53 PM
Subject: linux-next: manual merge of the akpm-current tree with the
Chris,
I'd like to comment on the following proposed change:
+int proc_dowatchdog_exclude(struct ctl_table *table, int write,
+void __user *buffer, size_t *lenp, loff_t *ppos)
+{
+int err;
+
+mutex_lock(&watchdog_proc_mutex);
+err = proc_do_lar
> - Original Message -
> From: "Don Zickus"
> To: "Ulrich Obergfell"
> Cc: linux-kernel@vger.kernel.org
> Sent: Tuesday, October 14, 2014 4:56:12 PM
> Subject: Re: [PATCH 0/1] watchdog: parameters to control hard and soft lockup
> detector indiv
0/13/313 and v2. The new version is
merely a re-base to the following upstream commits.
commit 6e7458a6f074c71e74cda31c483114e65ea0f570
Author: Ulrich Obergfell
Date: Mon Oct 13 15:55:35 2014 -0700
kernel/watchdog.c: control hard lockup detection default
commit 9919e39a17381058dd0cdef2f78
This series introduces a separate handler for each watchdog parameter
in /proc/sys/kernel. The separate handlers need a common function that
they can call to update the run state of the lockup detectors, or to
have the lockup detectors use a new sample period.
Signed-off-by: Ulrich Obergfell
t updates of 'watchdog_enabled' need not be synchronized via
a spinlock or a mutex. Updates can either be atomic or concurrency can
be detected by using 'cmpxchg'.
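
The two options mentioned above (an atomic update, or detecting concurrency with a compare-and-exchange) can be illustrated with C11 atomics in userspace. This is a sketch, not the kernel code: the bit names and helper functions are chosen for illustration, and `atomic_compare_exchange_strong()` stands in for the kernel's `cmpxchg`.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Illustrative bit layout; the names are assumptions for this sketch. */
#define NMI_WATCHDOG_ENABLED	(1u << 0)
#define SOFT_WATCHDOG_ENABLED	(1u << 1)

static _Atomic unsigned int watchdog_enabled =
	NMI_WATCHDOG_ENABLED | SOFT_WATCHDOG_ENABLED;

/* Option 1: a single atomic read-modify-write, no lock required. */
static void watchdog_clear_bits(unsigned int bits)
{
	atomic_fetch_and(&watchdog_enabled, ~bits);
}

/* Option 2: store 'new' only if the variable still holds 'old'.
 * A false return means someone else updated it in the meantime. */
static bool watchdog_try_update(unsigned int old, unsigned int new)
{
	return atomic_compare_exchange_strong(&watchdog_enabled, &old, new);
}
```

A caller that gets `false` from `watchdog_try_update()` has detected a concurrent update and can re-read the variable and decide whether to retry.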
Signed-off-by: Ulrich Obergfell
---
include/linux/nmi.h | 2 ++
kernel/watchdog.c | 27 +++
Separate handlers for each watchdog parameter in /proc/sys/kernel
replace the proc_dowatchdog() function. Three of those handlers
merely call proc_watchdog_common() with one different argument.
Signed-off-by: Ulrich Obergfell
---
include/linux/nmi.h | 8
kernel/watchdog.c | 59
This series removes the proc_dowatchdog() function. Since multiple
new functions need the 'watchdog_proc_mutex' to serialize access to
the watchdog parameters in /proc/sys/kernel, move the mutex outside
of any function.
Signed-off-by: Ulrich Obergfell
---
kernel/watchdog.c | 3 +
-off-by: Ulrich Obergfell
---
kernel/watchdog.c | 20
1 file changed, 12 insertions(+), 8 deletions(-)
diff --git a/kernel/watchdog.c b/kernel/watchdog.c
index cfdb2cb..f435e37 100644
--- a/kernel/watchdog.c
+++ b/kernel/watchdog.c
@@ -641,7 +641,7 @@ static void
boot time.
Also, remove the proc_dowatchdog() function which is no longer needed.
Signed-off-by: Ulrich Obergfell
---
include/linux/nmi.h | 2 --
kernel/sysctl.c | 35 +++
kernel/watchdog.c | 81 +++--
3 files c
Have kvm_guest_init() use hardlockup_detector_disable()
instead of watchdog_enable_hardlockup_detector(false).
Remove the watchdog_hardlockup_detector_is_enabled() and
the watchdog_enable_hardlockup_detector() function which
are no longer needed.
Signed-off-by: Ulrich Obergfell
---
arch/x86
If watchdog_nmi_enable() fails to set up the hardware perf event
of one CPU, the entire hard lockup detector is deemed unreliable.
Hence, disable the hard lockup detector and shut down the hardware
perf events on all CPUs.
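
The all-or-nothing policy described above can be modelled in a few lines of userspace C. Everything here is hypothetical scaffolding (the `model_` functions, the `broken_cpu` parameter, the fixed CPU count); it only illustrates the control flow, not how the kernel sets up perf events.

```c
#include <errno.h>

#define MODEL_NR_CPUS 4

static int perf_event_active[MODEL_NR_CPUS];
static int hardlockup_detector_enabled = 1;

/* Hypothetical per-CPU setup; fails on 'broken_cpu' to simulate an error. */
static int model_nmi_enable(int cpu, int broken_cpu)
{
	if (cpu == broken_cpu)
		return -ENODEV;
	perf_event_active[cpu] = 1;
	return 0;
}

/* Shut down the modelled perf events on every CPU. */
static void model_nmi_disable_all(void)
{
	for (int cpu = 0; cpu < MODEL_NR_CPUS; cpu++)
		perf_event_active[cpu] = 0;
}

/* If setup fails on any one CPU, disable the detector everywhere. */
int model_enable_hardlockup_detector(int broken_cpu)
{
	for (int cpu = 0; cpu < MODEL_NR_CPUS; cpu++) {
		int err = model_nmi_enable(cpu, broken_cpu);

		if (err) {
			hardlockup_detector_enabled = 0;
			model_nmi_disable_all();
			return err;
		}
	}
	return 0;
}
```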
Signed-off-by: Ulrich Obergfell
---
kernel/watchdog.c | 18
g the same lock
or mutex in watchdog thread context and in system call context needs to
be considered carefully because it can make the code prone to deadlock
situations in connection with parking/unparking the watchdog threads.]
Signed-off-by: Ulrich Obergfell
---
kernel/watchdog.c | 65 +
they (should) not relate to each other."
Please refer to [PATCH 1/1] for a description of the proposed
changes of the 'user interface' in /proc/sys/kernel and kernel
command line parameters.
Ulrich Obergfell (1):
watchdog: parameters to control hard and soft lockup detector
in
boot time.
Signed-off-by: Ulrich Obergfell
---
include/linux/nmi.h | 12 ++-
kernel/sysctl.c | 21 -
kernel/watchdog.c | 250 +++-
3 files changed, 236 insertions(+), 47 deletions(-)
diff --git a/include/linux/nmi.h b/include/linux/n
Commit-ID: df577149594cefacd62740e86de080c6336d699e
Gitweb: http://git.kernel.org/tip/df577149594cefacd62740e86de080c6336d699e
Author: Ulrich Obergfell
AuthorDate: Mon, 11 Aug 2014 10:49:25 -0400
Committer: Ingo Molnar
CommitDate: Mon, 18 Aug 2014 11:17:46 +0200
watchdog: Fix print
>- Original Message -
>From: "Ingo Molnar"
>To: "Don Zickus"
>Cc: a...@linux-foundation.org, k...@vger.kernel.org, pbonz...@redhat.com,
>mi...@redhat.com, "LKML" , "Ulrich Obergfell"
>, "Andrew Jones"
>Se
> - Original Message -
> From: "Andrew Jones"
> To: linux-kernel@vger.kernel.org, k...@vger.kernel.org
> Cc: uober...@redhat.com, dzic...@redhat.com, pbonz...@redhat.com,
> a...@linux-foundation.org, mi...@redhat.com
> Sent: Thursday, July 24, 2014 12:13:30 PM
> Subject: [PATCH 2/3] watch
>- Original Message -
>From: "Paolo Bonzini"
>To: "Ulrich Obergfell"
>Cc: "Andrew Jones" , linux-kernel@vger.kernel.org,
>k...@vger.kernel.org, dzic...@redhat.com, a...@linux-foundation.org,
>mi...@redhat.com
>Sent: Thursday, July
>- Original Message -
>From: "Paolo Bonzini"
>To: "Ulrich Obergfell"
>Cc: "Andrew Jones" , linux-kernel@vger.kernel.org,
>k...@vger.kernel.org, dzic...@redhat.com, a...@linux-foundation.org,
>mi...@redhat.com
>Sent: Thursday, July
> - Original Message -
> From: "Paolo Bonzini"
> To: "Andrew Jones" , linux-kernel@vger.kernel.org,
> k...@vger.kernel.org
> Cc: uober...@redhat.com, dzic...@redhat.com, a...@linux-foundation.org,
> mi...@redhat.com
> Sent: Thursday, July 24, 2014 12:46:11 PM
> Subject: Re: [PATCH 2/3] w