Re: perf overlapping maps...

2018-10-22 Thread Don Zickus
On Mon, Oct 22, 2018 at 06:16:13PM +0200, Jiri Olsa wrote: > On Mon, Oct 22, 2018 at 10:07:38AM -0400, Don Zickus wrote: > > (adding Jiri) > > > > On Fri, Oct 19, 2018 at 09:44:01PM -0700, David Miller wrote: > > > From: David Miller > > > Dat

Re: perf overlapping maps...

2018-10-22 Thread Don Zickus
> >header->misc. > > > > 2) Use this to elide the map groups clone in > >thread__clone_map_groups(). > > Looking into code history, I notice: > > commit 363b785f3805a2632eb09a8b430842461c21a640 > Author: Don Zickus > Date: Fri Mar 14 10:43:4

Re: [PATCH] get_maintainer: Allow option --mpath to read all files in

2018-08-15 Thread Don Zickus
ile or a directory. > > The behaviors are now: > > --mpath Read only the specific file as file > --mpath Read all files in as > files > --mpath --find-maintainer-files > Recurse through and read all files named > MAINTAINERS

Re: [PATCH] watchdog: Reduce message verbosity

2018-08-01 Thread Don Zickus
On Mon, Jul 30, 2018 at 12:43:34PM -0700, Sinan Kaya wrote: > Hi Don, > > On 7/30/2018 12:28 PM, Don Zickus wrote: > > > [0.152492] NMI watchdog: Perf event create on CPU 0 failed with -2 > > > [0.156002] NMI watchdog: Perf NMI watchdog permanently disabled

Re: [PATCH] watchdog: Reduce message verbosity

2018-07-30 Thread Don Zickus
On Mon, Jul 30, 2018 at 12:09:47PM -0700, Sinan Kaya wrote: > Reducing the verbosity level to debug for people that are interested in > debugging watchdog issues. > > [0.152492] NMI watchdog: Perf event create on CPU 0 failed with -2 > [0.156002] NMI watchdog: Perf NMI watchdog permanently

Re: [PATCH] get_maintainer.pl: Add optional .get_maintainer.MAINTAINERS override

2018-07-30 Thread Don Zickus
On Mon, Jul 16, 2018 at 05:20:19PM -0400, Don Zickus wrote: > On Fri, Jul 13, 2018 at 05:11:58PM -0700, Joe Perches wrote: > > On Fri, 2018-07-13 at 14:51 -0400, Don Zickus wrote: > > > On Fri, Jul 06, 2018 at 03:14:28PM -0700, Joe Perches wrote: > > > > On Fri,

Re: [PATCH] get_maintainer.pl: Add optional .get_maintainer.MAINTAINERS override

2018-07-16 Thread Don Zickus
On Fri, Jul 13, 2018 at 05:11:58PM -0700, Joe Perches wrote: > On Fri, 2018-07-13 at 14:51 -0400, Don Zickus wrote: > > On Fri, Jul 06, 2018 at 03:14:28PM -0700, Joe Perches wrote: > > > On Fri, 2018-07-06 at 15:09 -0700, Joe Perches wrote: > > > > On Fri, 2018-07

Re: [PATCH] get_maintainer.pl: Add optional .get_maintainer.MAINTAINERS override

2018-07-13 Thread Don Zickus
On Fri, Jul 06, 2018 at 03:14:28PM -0700, Joe Perches wrote: > On Fri, 2018-07-06 at 15:09 -0700, Joe Perches wrote: > > On Fri, 2018-07-06 at 17:58 -0400, Don Zickus wrote: > > > We have an internal use case of multiple MAINTAINER files, some folks have > > > more right

Re: [PATCH] get_maintainer.pl: Add optional .get_maintainer.MAINTAINERS override

2018-07-06 Thread Don Zickus
On Fri, Jul 06, 2018 at 03:14:28PM -0700, Joe Perches wrote: > On Fri, 2018-07-06 at 15:09 -0700, Joe Perches wrote: > > On Fri, 2018-07-06 at 17:58 -0400, Don Zickus wrote: > > > We have an internal use case of multiple MAINTAINER files, some folks have > > > more right

Re: [PATCH] get_maintainer.pl: Add optional .get_maintainer.MAINTAINERS override

2018-07-06 Thread Don Zickus
On Fri, Jul 06, 2018 at 03:14:28PM -0700, Joe Perches wrote: > On Fri, 2018-07-06 at 15:09 -0700, Joe Perches wrote: > > On Fri, 2018-07-06 at 17:58 -0400, Don Zickus wrote: > > > We have an internal use case of multiple MAINTAINER files, some folks have > > > more right

Re: [PATCH] get_maintainer.pl: Add optional .get_maintainer.MAINTAINERS override

2018-07-06 Thread Don Zickus
On Fri, Jul 06, 2018 at 03:09:17PM -0700, Joe Perches wrote: > On Fri, 2018-07-06 at 17:58 -0400, Don Zickus wrote: > > On Fri, Jul 06, 2018 at 02:36:28PM -0700, Joe Perches wrote: > > > > > > > Just trying to find ways to minimize our collection of

Re: [PATCH] get_maintainer.pl: Add optional .get_maintainer.MAINTAINERS override

2018-07-06 Thread Don Zickus
On Fri, Jul 06, 2018 at 02:36:28PM -0700, Joe Perches wrote: > > > > > Just trying to find ways to minimize our collection of private > > > > > patches. > > > > > > > > Perhaps that could be extended for your purpose > > > > with some additional argument like a specific > > > > optional directory

Re: [PATCH] get_maintainer.pl: Add optional .get_maintainer.MAINTAINERS override

2018-07-06 Thread Don Zickus
On Fri, Jul 06, 2018 at 11:31:13AM -0700, Joe Perches wrote: > On Fri, 2018-07-06 at 13:54 -0400, Don Zickus wrote: > > On Tue, Jun 26, 2018 at 01:16:11PM -0700, Joe Perches wrote: > > > On Tue, 2018-06-26 at 14:25 -0400, Prarit Bhargava wrote: > > > > OSes have addi

Re: [PATCH] get_maintainer.pl: Add optional .get_maintainer.MAINTAINERS override

2018-07-06 Thread Don Zickus
On Fri, Jul 06, 2018 at 11:31:13AM -0700, Joe Perches wrote: > On Fri, 2018-07-06 at 13:54 -0400, Don Zickus wrote: > > On Tue, Jun 26, 2018 at 01:16:11PM -0700, Joe Perches wrote: > > > On Tue, 2018-06-26 at 14:25 -0400, Prarit Bhargava wrote: > > > > OSes have addi

Re: [PATCH] get_maintainer.pl: Add optional .get_maintainer.MAINTAINERS override

2018-07-06 Thread Don Zickus
On Tue, Jun 26, 2018 at 01:16:11PM -0700, Joe Perches wrote: > On Tue, 2018-06-26 at 14:25 -0400, Prarit Bhargava wrote: > > OSes have additional maintainers that should be cc'd on patches or may > > want to circulate internal patches. > > > > Parse the .get_maintainer.MAINTAINERS file. Entries i

Re: [PATCH] Documentation: Better document the hardlockup_panic sysctl

2017-12-11 Thread Don Zickus
corresponding entry in Documentation/admin-guide/kernel-parameters.txt. Acked-by: Don Zickus > > Signed-off-by: Scott Wood > --- > Documentation/admin-guide/kernel-parameters.txt | 3 +++ > Documentation/sysctl/kernel.txt | 14 ++ > 2 files cha

[tip:core/urgent] watchdog/hardlockup/perf: Use atomics to track in-use cpu counter

2017-11-01 Thread tip-bot for Don Zickus
Commit-ID: 42f930da7f00c0ab23df4c7aed36137f35988980 Gitweb: https://git.kernel.org/tip/42f930da7f00c0ab23df4c7aed36137f35988980 Author: Don Zickus AuthorDate: Wed, 1 Nov 2017 14:11:27 -0400 Committer: Thomas Gleixner CommitDate: Wed, 1 Nov 2017 21:18:40 +0100 watchdog/hardlockup/perf

[tip:core/urgent] watchdog/hardlockup/perf: Use atomics to track in-use cpu counter

2017-11-01 Thread tip-bot for Don Zickus
Commit-ID: c7254c8aabe3025770fdb6f2d84aded11716ca2b Gitweb: https://git.kernel.org/tip/c7254c8aabe3025770fdb6f2d84aded11716ca2b Author: Don Zickus AuthorDate: Wed, 1 Nov 2017 14:11:27 -0400 Committer: Thomas Gleixner CommitDate: Wed, 1 Nov 2017 20:41:28 +0100 watchdog/hardlockup/perf

Re: Crashes in perf_event_ctx_lock_nested

2017-11-01 Thread Don Zickus
On Tue, Oct 31, 2017 at 03:11:07PM -0700, Guenter Roeck wrote: > On Tue, Oct 31, 2017 at 10:32:00PM +0100, Thomas Gleixner wrote: > > [ ...] > > > So we have to revert > > > > a33d44843d45 ("watchdog/hardlockup/perf: Simplify deferred event destroy") > > > > Patch attached. > > > > Tested-by

Re: Crashes in perf_event_ctx_lock_nested

2017-10-31 Thread Don Zickus
> > Is Chrome OS, changing the default timeout from 10s to something else? > > That would explain it as a script is executed late in the boot cycle and > > explain the quick restart. > > > > Correct, Chrome OS changes the timeout from 10 to 5 seconds. > > A little experiment suggests that the pr

Re: Crashes in perf_event_ctx_lock_nested

2017-10-31 Thread Don Zickus
On Tue, Oct 31, 2017 at 10:16:22AM -0700, Guenter Roeck wrote: > On Tue, Oct 31, 2017 at 02:48:50PM +0100, Peter Zijlstra wrote: > > On Mon, Oct 30, 2017 at 03:45:12PM -0700, Guenter Roeck wrote: > > > I added some logging and a long msleep() in > > > hardlockup_detector_perf_cleanup(). > > > Here

Re: Crashes in perf_event_ctx_lock_nested

2017-10-31 Thread Don Zickus
On Mon, Oct 30, 2017 at 03:45:12PM -0700, Guenter Roeck wrote: > Hi Thomas, > > we are seeing the following crash in v4.14-rc5/rc7 if > CONFIG_HARDLOCKUP_DETECTOR > is enabled. > > [5.908021] NMI watchdog: Enabled. Permanently consumes one hw-PMU counter. > [5.915836] > =

Re: [patch V2 22/29] lockup_detector: Make watchdog_nmi_reconfigure() two stage

2017-10-05 Thread Don Zickus
inux/kernel/git/tip/tip.git WIP.core/urgent > > That's based on 4.13 final so it neither contains 4.14 nor -next material. Tested your changes on 4.14-rc3 and it passes my tests. Thanks! Tested-and-Reviewed-by: Don Zickus

Re: [RFC GIT Pull] core watchdog sanitizing

2017-10-02 Thread Don Zickus
On Mon, Oct 02, 2017 at 07:32:57PM +, Thomas Gleixner wrote: > On Mon, 2 Oct 2017, Linus Torvalds wrote: > > Side note: would it perhaps make sense to have that > > cpus_read_lock/unlock() sequence around the whole reconfiguration > > section? > > > > Because while looking at that sequence, it

Re: [PATCH][next] kernel: watchdog: fix spelling mistake: "permanetely" -> "permanently"

2017-09-26 Thread Don Zickus
On Tue, Sep 26, 2017 at 09:36:03AM +, Colin King wrote: > From: Colin Ian King > > trivial fix to spelling mistake in pr_info message Acked-by: Don Zickus > > Signed-off-by: Colin Ian King > --- > kernel/watchdog_hld.c | 2 +- > 1 file changed, 1 insertion(+),

Re: [patch V2 04/29] parisc: Use lockup_detector_stop()

2017-09-14 Thread Don Zickus
tine. > > > > Signed-off-by: Thomas Gleixner > > Cc: Don Zickus > > Cc: Chris Metcalf > > Cc: linux-par...@vger.kernel.org > > Cc: Peter Zijlstra > > Cc: Sebastian Siewior > > Cc: Nicholas Piggin > > Cc: Ulrich Obergfell > >

Re: [patch V2 00/29] lockup_detector: Cure hotplug deadlocks and replace duct tape

2017-09-13 Thread Don Zickus
h makes it unreadable > > - There is more wreckage, but see the changelogs for the ugly details. > Aside from the simple compile issue in patch 25. I have no issues with this patchset. Thanks Thomas! Reviewed-by: Don Zickus > The following series sanitizes the facility and addresses

Re: [patch V2 25/29] lockup_detector: Implement init time detection of perf

2017-09-13 Thread Don Zickus
homas Gleixner > Cc: Don Zickus > Cc: Chris Metcalf > Cc: Peter Zijlstra > Cc: Sebastian Siewior > Cc: Nicholas Piggin > Cc: Ulrich Obergfell > Cc: Borislav Petkov > Cc: Andrew Morton > Link: http://lkml.kernel.org/r/20170831073054.997264...@linutronix.de > >

Re: [patch 00/29] lockup_detector: Cure hotplug deadlocks and replace duct tape

2017-09-07 Thread Don Zickus
On Thu, Aug 31, 2017 at 09:15:58AM +0200, Thomas Gleixner wrote: > The lockup detector is broken is several ways: > > - It's deadlock prone vs. CPU hotplug in various ways. Some of these > are due to recursive cpus_read_lock() others are due to > cpus_read_lock() from CPU hotplug c

Re: [patch 24/29] lockup_detector/perf: Implement init time perf validation

2017-09-07 Thread Don Zickus
On Thu, Aug 31, 2017 at 09:16:22AM +0200, Thomas Gleixner wrote: > The watchdog tries to create perf events even after it figured out that > perf is not functional or the requested event is not supported. > > That's braindead as this can be done once at init time and if not supported > the NMI wat

Re: [patch 11/29] lockup_detector: Remove park_in_progress hackery

2017-09-05 Thread Don Zickus
On Mon, Sep 04, 2017 at 02:10:50PM +0200, Peter Zijlstra wrote: > On Mon, Sep 04, 2017 at 01:09:06PM +0200, Ulrich Obergfell wrote: > > > - A thread hogs CPU N (soft lockup) so that watchdog/N is unable to run. > > - A user re-configures 'watchdog_thresh' on the fly. The reconfiguration > > requ

Re: [patch 10/29] lockup_detector/perf: Prevent cpu hotplug deadlock

2017-09-05 Thread Don Zickus
On Fri, Sep 01, 2017 at 09:29:07PM +0200, Thomas Gleixner wrote: > On Fri, 1 Sep 2017, Don Zickus wrote: > > On Thu, Aug 31, 2017 at 09:16:08AM +0200, Thomas Gleixner wrote: > > > The following deadlock is possible in the watchdog hotplug code: > > >

Re: [patch 17/29] lockup_detector: Get rid of the thread teardown/setup dance

2017-09-01 Thread Don Zickus
On Thu, Aug 31, 2017 at 09:16:15AM +0200, Thomas Gleixner wrote: > The lockup detector reconfiguration tears down all watchdog threads when > the watchdog is disabled and sets them up again when its enabled. > > That's a pointless exercise. The watchdog threads are not consuming an > insane amount

Re: [patch 10/29] lockup_detector/perf: Prevent cpu hotplug deadlock

2017-09-01 Thread Don Zickus
On Thu, Aug 31, 2017 at 09:16:08AM +0200, Thomas Gleixner wrote: > The following deadlock is possible in the watchdog hotplug code: > > cpus_write_lock() > ... > takedown_cpu() > smpboot_park_threads() > smpboot_park_thread() > kthread_park() >

Re: [patch 00/29] lockup_detector: Cure hotplug deadlocks and replace duct tape

2017-08-31 Thread Don Zickus
On Thu, Aug 31, 2017 at 09:15:58AM +0200, Thomas Gleixner wrote: > The lockup detector is broken is several ways: > > - It's deadlock prone vs. CPU hotplug in various ways. Some of these > are due to recursive cpus_read_lock() others are due to > cpus_read_lock() from CPU hotplug c

Re: [PATCH] kernel/watchdog: fix Kconfig constraints for perf hardlockup watchdog

2017-08-10 Thread Don Zickus
assumed an arch went one way or the other. I think this is a good workaround for now. Acked-by: Don Zickus > > Fixes: 05a4a9527931 ("kernel/watchdog: split up config options") > Signed-off-by: Nicholas Piggin > --- > > arch/powerpc/Kconfig | 2 +- > arch/x86/

Re: [PATCH] x86/nmi: Use raw lock

2017-07-25 Thread Don Zickus
where we register an nmi handler. Acked-by: Don Zickus > > Signed-off-by: Scott Wood > --- > arch/x86/kernel/nmi.c | 18 +- > 1 file changed, 9 insertions(+), 9 deletions(-) > > diff --git a/arch/x86/kernel/nmi.c b/arch/x86/kernel/nmi.c > index 446c8aa0

Re: [PATCH V2] kernel/watchdog: fix spurious hard lockups

2017-07-17 Thread Don Zickus
On Mon, Jul 17, 2017 at 01:24:23AM +, Liang, Kan wrote: > Hi Don & Thomas, > > Sorry for the late response. We just finished the tests for all proposed > patches. > > There are three proposed patches so far. > Patch 1: The patch as above which speed up the hrtimer. > Patch 2: Thomas's first

Re: [PATCH V2] kernel/watchdog: fix spurious hard lockups

2017-06-29 Thread Don Zickus
On Thu, Jun 29, 2017 at 09:12:20AM -0700, Andi Kleen wrote: > On Thu, Jun 29, 2017 at 11:44:06AM -0400, Don Zickus wrote: > > On Wed, Jun 28, 2017 at 01:14:04PM -0700, Andi Kleen wrote: > > > It can be a useful debugging tool for a specific class of bugs: > > > when

Re: [PATCH V2] kernel/watchdog: fix spurious hard lockups

2017-06-29 Thread Don Zickus
On Wed, Jun 28, 2017 at 01:14:04PM -0700, Andi Kleen wrote: > It can be a useful debugging tool for a specific class of bugs: > when kernel software is looping forever. > > But if that happens does it really matter how many iterations the > loop does before it is stopped? > > Even the current ti

Re: [PATCH V2] kernel/watchdog: fix spurious hard lockups

2017-06-28 Thread Don Zickus
On Tue, Jun 27, 2017 at 04:48:22PM -0700, Andi Kleen wrote: > > I haven't heard back any test result yet. > > > > The above patch looks good to me. > > This needs performance testing. It may slow down performance or latency > sensitive workloads. More motivation to work through the issues with

Re: [PATCH V2] kernel/watchdog: fix spurious hard lockups

2017-06-27 Thread Don Zickus
On Tue, Jun 27, 2017 at 08:49:19PM +, Liang, Kan wrote: > > > On Mon, Jun 26, 2017 at 04:19:27PM -0400, Don Zickus wrote: > > > On Fri, Jun 23, 2017 at 11:50:25PM +0200, Thomas Gleixner wrote: > > > > On Fri, 23 Jun 2017, Don Zickus wrote: > > > >

Re: [PATCH V2] kernel/watchdog: fix spurious hard lockups

2017-06-27 Thread Don Zickus
On Mon, Jun 26, 2017 at 04:19:27PM -0400, Don Zickus wrote: > On Fri, Jun 23, 2017 at 11:50:25PM +0200, Thomas Gleixner wrote: > > On Fri, 23 Jun 2017, Don Zickus wrote: > > > Hmm, all this work for a temp fix. Kan, how much longer until the real > > > fix > >

Re: [PATCH V2] kernel/watchdog: fix spurious hard lockups

2017-06-26 Thread Don Zickus
On Fri, Jun 23, 2017 at 11:50:25PM +0200, Thomas Gleixner wrote: > On Fri, 23 Jun 2017, Don Zickus wrote: > > Hmm, all this work for a temp fix. Kan, how much longer until the real fix > > of having perf count the right cycles? > > Quite a while. The approach is wilfully br

Re: [PATCH V2] kernel/watchdog: fix spurious hard lockups

2017-06-23 Thread Don Zickus
On Fri, Jun 23, 2017 at 10:01:55AM +0200, Thomas Gleixner wrote: > On Thu, 22 Jun 2017, Don Zickus wrote: > > On Wed, Jun 21, 2017 at 11:53:57PM +0200, Thomas Gleixner wrote: > > > On Wed, 21 Jun 2017, kan.li...@intel.com wrote: > > > > We now have more and more sy

Re: [PATCH V2] kernel/watchdog: fix spurious hard lockups

2017-06-22 Thread Don Zickus
On Wed, Jun 21, 2017 at 11:53:57PM +0200, Thomas Gleixner wrote: > On Wed, 21 Jun 2017, kan.li...@intel.com wrote: > > We now have more and more systems where the Turbo range is wide enough > > that the NMI watchdog expires faster than the soft watchdog timer that > > updates the interrupt tick the

Re: [PATCH] kernel/watchdog: fix spurious hard lockups

2017-06-21 Thread Don Zickus
On Wed, Jun 21, 2017 at 12:40:28PM +, Liang, Kan wrote: > > > > > > > The right fix for mainline can be found here. > > > perf/x86/intel: enable CPU ref_cycles for GP counter perf/x86/intel, > > > watchdog: Switch NMI watchdog to ref cycles on x86 > > > https://patchwork.kernel.org/patch/97790

Re: [PATCH] kernel/watchdog: hide unused function

2017-06-21 Thread Don Zickus
n] > > This adds another #ifdef around it. Thanks! Acked-by: Don Zickus > > Fixes: mmotm ("kernel/watchdog: provide watchdog_nmi_reconfigure() for arch > watchdogs") > Signed-off-by: Arnd Bergmann > --- > kernel/watchdog.c | 4 > 1 file changed

Re: [PATCH] kernel/watchdog: fix spurious hard lockups

2017-06-21 Thread Don Zickus
On Tue, Jun 20, 2017 at 02:33:09PM -0700, kan.li...@intel.com wrote: > From: Kan Liang > > Some users reported spurious NMI watchdog timeouts. > > We now have more and more systems where the Turbo range is wide enough > that the NMI watchdog expires faster than the soft watchdog timer that > upd

Re: [PATCH v4 0/5] Improve watchdog config for arch watchdogs

2017-06-16 Thread Don Zickus
watchdog patches seem to go via Andrew... Andrew, suggestions here? Reviewed-by: Don Zickus Acked-by: Don Zickus > > Thanks, > Nick > > Nicholas Piggin (5): > watchdog: remove unused declaration > watchdog: introduce arch_touch_nmi_watchdog() > watchdog: spl

Re: [PATCH 3/4] watchdog: Split up config options

2017-06-15 Thread Don Zickus
On Fri, Jun 16, 2017 at 01:59:00AM +1000, Nicholas Piggin wrote: > On Thu, 15 Jun 2017 11:51:22 -0400 > Don Zickus wrote: > > > On Thu, Jun 15, 2017 at 01:04:01PM +1000, Nicholas Piggin wrote: > > > > +#ifdef CONFIG_HARDLOCKUP_DETECTOR

Re: [PATCH 3/4] watchdog: Split up config options

2017-06-15 Thread Don Zickus
On Thu, Jun 15, 2017 at 01:04:01PM +1000, Nicholas Piggin wrote: > > +#ifdef CONFIG_HARDLOCKUP_DETECTOR > > /* boot commands */ > > /* > >* Should we panic when a soft-lockup or hard-lockup occurs: > > @@ -69,9 +73,6 @@ static int __init hardlockup_panic_setup(char *str) > > return

Re: [PATCH 3/4] watchdog: Split up config options

2017-06-14 Thread Don Zickus
On Wed, Jun 14, 2017 at 02:11:18AM +1000, Nicholas Piggin wrote: > > Yeah, if you wouldn't mind. Sorry for dragging this out, but I feel like we > > are getting close to have this defined properly which would allow us to > > split the code up correctly in the future. > > How's this for a replacem

Re: [PATCH 3/4] watchdog: Split up config options

2017-06-12 Thread Don Zickus
On Mon, Jun 12, 2017 at 06:07:39PM +1000, Nicholas Piggin wrote: > > > This would probably be the right direction to go in, but it will take > > > slightly more I think. We first need to remove HAVE_NMI_WATCHDOG from > > > meaning that an arch has its own watchdog and does not want any HLD > > > st

Re: [PATCH 3/4] watchdog: Split up config options

2017-06-08 Thread Don Zickus
On Wed, Jun 07, 2017 at 01:50:26PM +1000, Nicholas Piggin wrote: > > > > I _think_ having > > > > depends on LOCKUP_DETECTOR > > depends on HAVE_NMI_WATCHDOG || HAVE_PERF_EVENTS_NMI > > select HARDLOCKUP_DETECTOR_PERF if !HAVE_NMI_WATCHDOG > > > > will work because your new definition of HARDLOC

Re: [PATCH 0/4][V3] Improve watchdog config for arch watchdogs

2017-06-07 Thread Don Zickus
On Tue, Jun 06, 2017 at 02:46:48PM -0500, Babu Moger wrote: > Hi Don, Nicholas, > > > On 6/6/2017 11:08 AM, Don Zickus wrote: > > (adding Babu) > > > > On Tue, May 30, 2017 at 11:26:55AM +1000, Nicholas Piggin wrote: > > > Since last time: > > >

Re: [PATCH 3/4] watchdog: Split up config options

2017-06-06 Thread Don Zickus
On Sat, Jun 03, 2017 at 04:10:05PM +1000, Nicholas Piggin wrote: > > My last concern is wrapping my head around the config options. > > > > HAVE_NMI_WATCHDOG seems to have a dual meaning, I think. > > Yeah it's not the clearest. I think we need another pass over config > options to start straight

Re: [PATCH 0/4][V3] Improve watchdog config for arch watchdogs

2017-06-06 Thread Don Zickus
(adding Babu) On Tue, May 30, 2017 at 11:26:55AM +1000, Nicholas Piggin wrote: > Since last time: > > - Have the perf based hardlockup detector use arch_touch_nmi_watchdog() > rather than hld_touch_nmi_watchdog(). This changes direction slightly > to make the perf-based hard lockup detector a

Re: [PATCH 3/4] watchdog: Split up config options

2017-06-02 Thread Don Zickus
On Tue, May 30, 2017 at 11:26:58AM +1000, Nicholas Piggin wrote: > Split SOFTLOCKUP_DETECTOR from LOCKUP_DETECTOR, and split > HARDLOCKUP_DETECTOR_PERF from HARDLOCKUP_DETECTOR. > > LOCKUP_DETECTOR provides the boot, sysctl, and programming interfaces > for lockup detectors. An architecture that d

Re: [PATCH 4/4] watchdog: provide watchdog_reconfigure() for arch watchdogs

2017-05-26 Thread Don Zickus
On Fri, May 26, 2017 at 10:39:09AM +1000, Nicholas Piggin wrote: > On Thu, 25 May 2017 10:08:33 -0400 > Don Zickus wrote: > > > On Thu, May 25, 2017 at 06:28:56PM +1000, Nicholas Piggin wrote: > > > After reconfiguring watchdog sysctls etc., architecture specific > >

Re: [PATCH 2/4] watchdog: introduce arch_touch_nmi_watchdog()

2017-05-26 Thread Don Zickus
On Fri, May 26, 2017 at 10:31:03AM +1000, Nicholas Piggin wrote: > On Thu, 25 May 2017 09:55:59 -0400 > Don Zickus wrote: > > > On Thu, May 25, 2017 at 06:28:54PM +1000, Nicholas Piggin wrote: > > > For architectures that define HAVE_NMI_WATCHDOG, instead of havin

Re: [PATCH 4/4] watchdog: provide watchdog_reconfigure() for arch watchdogs

2017-05-25 Thread Don Zickus
On Thu, May 25, 2017 at 06:28:56PM +1000, Nicholas Piggin wrote: > After reconfiguring watchdog sysctls etc., architecture specific > watchdogs may not get all their parameters updated. > > watchdog_reconfigure() can be implemented to pull the new values > in and set the arch NMI watchdog. I unde

Re: [PATCH 2/4] watchdog: introduce arch_touch_nmi_watchdog()

2017-05-25 Thread Don Zickus
On Thu, May 25, 2017 at 06:28:54PM +1000, Nicholas Piggin wrote: > For architectures that define HAVE_NMI_WATCHDOG, instead of having > them provide the complete touch_nmi_watchdog() function, just have > them provide arch_touch_nmi_watchdog(). > > This gives the generic code more flexibility in i

Re: [RFC] arch hardlockup detector interfaces improvement

2017-05-19 Thread Don Zickus
On Sat, May 20, 2017 at 12:53:06AM +1000, Nicholas Piggin wrote: > > I am curious to know what IBM thinks there. Currently the HARDLOCKUP > > detector sits on top of perf. I get the impression, you are removing that > > dependency. Is that a permanent thing or are you thinking of switching back

Re: [RFC] arch hardlockup detector interfaces improvement

2017-05-19 Thread Don Zickus
On Fri, May 19, 2017 at 09:07:31AM +1000, Nicholas Piggin wrote: > On Thu, 18 May 2017 12:30:28 -0400 > Don Zickus wrote: > > > (adding Uli) > > > > On Fri, May 19, 2017 at 01:50:26AM +1000, Nicholas Piggin wrote: > > > I'd like to make it easier

Re: [RFC] arch hardlockup detector interfaces improvement

2017-05-18 Thread Don Zickus
(adding Uli) On Fri, May 19, 2017 at 01:50:26AM +1000, Nicholas Piggin wrote: > I'd like to make it easier for architectures that have their own NMI / > hard lockup detector to reuse various configuration interfaces that are > provided by generic detectors (cmdline, sysctl, suspend/resume calls).

Re: [PATCH 1/2] x86/platform: Add a low priority low frequency NMI call chain

2017-03-07 Thread Don Zickus
On Tue, Mar 07, 2017 at 08:00:33AM -0800, Mike Travis wrote: > > > On 3/7/2017 7:22 AM, Don Zickus wrote: > > On Tue, Mar 07, 2017 at 08:42:10AM +0100, Ingo Molnar wrote: > >> > >> * Mike Travis wrote: > >> > >>> Add a new NMI call

Re: [PATCH 1/2] x86/platform: Add a low priority low frequency NMI call chain

2017-03-07 Thread Don Zickus
On Tue, Mar 07, 2017 at 08:42:10AM +0100, Ingo Molnar wrote: > > * Mike Travis wrote: > > > Add a new NMI call chain that is called last after all other NMI handlers > > have been checked and did not "handle" the NMI. This mimics the current > > NMI_UNKNOWN call chain except it eliminates the W

Re: [PATCH] kernel/watchdog.c: Do not hardcode CPU 0 as the initial thread

2017-01-05 Thread Don Zickus
(cc'ing Andrew) On Tue, Jan 03, 2017 at 04:19:50PM -0500, Prarit Bhargava wrote: > > > On 12/01/2016 03:06 PM, Don Zickus wrote: > > On Tue, Nov 29, 2016 at 08:15:21AM -0500, Prarit Bhargava wrote: > >> When CONFIG_BOOTPARAM_HOTPLUG_CPU0 is enabled, the socket con

Re: [PATCH] x86/kbuild: enable modversions for symbols exported from asm

2016-12-14 Thread Don Zickus
On Sat, Dec 10, 2016 at 01:41:03PM +0100, Greg Kroah-Hartman wrote: > On Fri, Dec 09, 2016 at 11:46:54PM +0100, Dodji Seketeli wrote: > > Hello, > > > > Nicholas Piggin a écrit: > > > > [...] > > > > > That said, a dwarf based checker tool should be able to do as good a job > > > (maybe a bit b

Re: [PATCH] x86/kbuild: enable modversions for symbols exported from asm

2016-12-09 Thread Don Zickus
On Fri, Dec 09, 2016 at 01:50:41PM +1000, Nicholas Piggin wrote: > > > > We have plenty of customers with 10 year old drivers, where the expertise > > has long left the company. The engineers still around, recompile and make > > tweaks to get things working on the latest RHEL. Verify it passes t

[PATCH] kernel/watchdog: Prevent false hardlockup on overloaded system

2016-12-06 Thread Don Zickus
Signed-off-by: Don Zickus --- include/linux/nmi.h | 1 + kernel/watchdog.c | 9 + kernel/watchdog_hld.c | 3 +++ 3 files changed, 13 insertions(+) diff --git a/include/linux/nmi.h b/include/linux/nmi.h index 0ea0a38..67e3392 100644 --- a/include/linux/nmi.h +++ b/include/linux/nmi.h

Re: [PATCH] kernel/watchdog.c: Do not hardcode CPU 0 as the initial thread

2016-12-01 Thread Don Zickus
he last watchdog thread is > disabled. > > This patch is based on top of linux-next akpm-base. It passed my tests. Thanks! Acked-by: Don Zickus > > Signed-off-by: Prarit Bhargava > Cc: Borislav Petkov > Cc: Tejun Heo > Cc: Don Zickus > Cc: Hidehiro Kawai >

Re: [PATCH] x86/kbuild: enable modversions for symbols exported from asm

2016-12-01 Thread Don Zickus
On Thu, Dec 01, 2016 at 05:06:11PM +0100, Greg Kroah-Hartman wrote: > On Thu, Dec 01, 2016 at 10:40:59AM -0500, Don Zickus wrote: > > Unfortunately, there are various drivers that will never go upstream > > > > - paid storage drivers that provide bells and whistles on top

Re: [PATCH] x86/kbuild: enable modversions for symbols exported from asm

2016-12-01 Thread Don Zickus
On Thu, Dec 01, 2016 at 07:26:09AM -0800, Christoph Hellwig wrote: > On Thu, Dec 01, 2016 at 10:20:39AM -0500, Don Zickus wrote: > > > > - provide the memory allocation (instead of having the driver staticly > > allocate) > > - provide functions to retrieve variou

Re: [PATCH] x86/kbuild: enable modversions for symbols exported from asm

2016-12-01 Thread Don Zickus
On Thu, Dec 01, 2016 at 03:32:15PM +1100, Nicholas Piggin wrote: > > Anyway, MODVERSIONS is our way of protecting our kabi for the last 10 years. > > It isn't perfect and we have fixed the genksyms tool over the years, but so > > far it mostly works fine. > > Okay. It would be good to get all the

Re: [PATCH] x86/kbuild: enable modversions for symbols exported from asm

2016-11-30 Thread Don Zickus
On Wed, Nov 30, 2016 at 10:40:02AM -0800, Linus Torvalds wrote: > On Wed, Nov 30, 2016 at 10:18 AM, Nicholas Piggin wrote: > > > > Here's an initial rough hack at removing modversions. It gives an idea > > of the complexity we're carrying for this feature (keeping in mind most > > of the lines rem

Re: [PATCH] kernel/watchdog.c: Do not hardcode CPU 0 as the initial thread

2016-11-29 Thread Don Zickus
nd verify things later this week. Cheers, Don > > Signed-off-by: Prarit Bhargava > Cc: Borislav Petkov > Cc: Tejun Heo > Cc: Don Zickus > Cc: Hidehiro Kawai > Cc: Thomas Gleixner > Cc: Andi Kleen > Cc: Joshua Hunt > Cc: Ingo Molnar > Cc: Babu Moger >

Re: [PATCH] kernel/watchdog.c: Only output hw-PMU message once

2016-11-21 Thread Don Zickus
me. Hmm, it occurred to me, with pr_info_once, what happens if you disable and re-enable, is this still printed? echo 0 > /proc/sys/kernel/watchdog echo 1 > /proc/sys/kernel/watchdog Cheers, Don > > Signed-off-by: Prarit Bhargava > Cc: Borislav Petkov > Cc: Tejun Heo > Cc: Don Zic

Re: [PATCH v2 0/3] Clean up watchdog handlers

2016-11-04 Thread Don Zickus
o implement their own > handlers. watchdog_nmi_enable and watchdog_nmi_disable will be defined > as weak such that architectures can override its definitions. > > Thanks to Don Zickus for his suggestions. > Here are our previous discussions > http://www.spinics.net/lists/sparc

Re: [RFC PATCH 0/4] Clean up watchdog handlers

2016-11-01 Thread Don Zickus
On Mon, Oct 31, 2016 at 04:30:59PM -0500, Babu Moger wrote: > > On 10/31/2016 4:00 PM, Don Zickus wrote: > >On Wed, Oct 26, 2016 at 09:02:19AM -0700, Babu Moger wrote: > >>This is an attempt to cleanup watchdog handlers. Right now, > >>kernel/watchdog.c implements

Re: [RFC PATCH 0/4] Clean up watchdog handlers

2016-10-31 Thread Don Zickus
o implement their own > handlers. watchdog_nmi_enable and watchdog_nmi_disable will be defined > as weak such that architectures can override its definitions. > > Thanks to Don Zickus for his suggestions. > Here is the previous discussion > http://www.spinics.net/lists/sparclinux/msg

Re: [RFC PATCH 0/4] Clean up watchdog handlers

2016-10-27 Thread Don Zickus
o implement their own > handlers. watchdog_nmi_enable and watchdog_nmi_disable will be defined > as weak such that architectures can override its definitions. Thanks for the patches Babu. I will try to get to them today or tomorrow. Cheers, Don > > Thanks to Don Zickus for his suggestio

Re: [PATCH v2 1/2] watchdog: Introduce arch_watchdog_nmi_enable and arch_watchdog_nmi_disable

2016-10-24 Thread Don Zickus
On Fri, Oct 21, 2016 at 04:50:21PM -0500, Babu Moger wrote: > Don, > > On 10/21/2016 2:19 PM, Andrew Morton wrote: > >On Fri, 21 Oct 2016 11:11:14 -0400 Don Zickus wrote: > > > >>On Thu, Oct 20, 2016 at 08:25:27PM -0700, Andrew Morton wrote: > >>>On

Re: [PATCH v2 1/2] watchdog: Introduce arch_watchdog_nmi_enable and arch_watchdog_nmi_disable

2016-10-21 Thread Don Zickus
On Thu, Oct 20, 2016 at 08:25:27PM -0700, Andrew Morton wrote: > On Thu, 20 Oct 2016 12:14:14 -0400 Don Zickus wrote: > > > > > -static int watchdog_nmi_enable(unsigned int cpu) { return 0; } > > > > -static void watchdog_nmi_disable(unsigned int cpu) { return; }

Re: [PATCH v2 1/2] watchdog: Introduce arch_watchdog_nmi_enable and arch_watchdog_nmi_disable

2016-10-20 Thread Don Zickus
On Wed, Oct 19, 2016 at 05:00:12PM -0700, Andrew Morton wrote: > On Thu, 13 Oct 2016 13:38:01 -0700 Babu Moger wrote: > > > Currently we do not have a way to enable/disable arch specific > > watchdog handlers if it was implemented by any of the architectures. > > > > This patch introduces new fu

Re: [PATCH v3 0/2] Introduce arch specific nmi enable, disable handlers

2016-10-18 Thread Don Zickus
on x86. Thanks Babu! Tested-and-Reviewed-by: Don Zickus > > v3: > Made one more change per Don Zickus comments. > Moved failure path messages to into generic code inside watchdog_nmi_enable. > Also added matching prints in sparc to warn about the failure. > > v2: >

Re: [PATCH v2 1/2] watchdog: Introduce arch_watchdog_nmi_enable and arch_watchdog_nmi_disable

2016-10-17 Thread Don Zickus
On Thu, Oct 13, 2016 at 01:38:01PM -0700, Babu Moger wrote: > Currently we do not have a way to enable/disable arch specific > watchdog handlers if it was implemented by any of the architectures. > > This patch introduces new functions arch_watchdog_nmi_enable and > arch_watchdog_nmi_disable which

Re: [PATCH v2 0/2] Introduce arch specific nmi enable, disable handlers

2016-10-17 Thread Don Zickus
rg's comments about making the definitions visible. > With the new approach we dont need those definitions((NMI_WATCHDOG_ENABLED, > SOFT_WATCHDOG_ENABLED etc..) outside watchdog.c. So no action. > > b) Made changes per Don Zickus comments. > Don, I could not use your p

Re: [PATCH 0/2] Introduce update_arch_nmi_watchdog for arch specific handlers

2016-10-07 Thread Don Zickus
On Thu, Oct 06, 2016 at 03:16:41PM -0700, Babu Moger wrote: > During our testing we noticed that nmi watchdogs in sparc could not be > disabled or > enabled dynamically using sysctl/proc interface. Sparc uses its own arch > specific > nmi watchdogs. There is a sysctl and proc > interface(proc/sy

Re: [PATCH 05/61] perf tools: Introduce c2c_decode_stats function

2016-09-21 Thread Don Zickus
On Wed, Sep 21, 2016 at 11:18:29AM +0200, Jiri Olsa wrote: > On Wed, Sep 21, 2016 at 09:08:40AM +, Stanislav Ievlev wrote: > > Hi, Jiri! > > > > Why are you not using unsigned integer for counters in c2c_stats structure? > > hi, > never really thought of that, because that's one of the origin

Re: [PATCH 4.5 142/238] watchdog: dont run proc_watchdog_update if new value is same as old

2016-04-13 Thread Don Zickus
tchdog: enabled on all CPUs, permanently consumes one hw-PMU > > counter. > >   NMI watchdog: enabled on all CPUs, permanently consumes one hw-PMU > > counter. > > > > There doesn't appear to be a reason for doing this work every time a write > > occurs,

Re: [PATCH 1/2] watchdog: Fix output

2016-03-19 Thread Don Zickus
On Fri, Mar 18, 2016 at 05:48:58PM +0100, Peter Zijlstra wrote: > On Fri, Mar 18, 2016 at 05:44:41PM +0100, Peter Zijlstra wrote: > > On Fri, Mar 18, 2016 at 12:37:48PM -0400, Don Zickus wrote: > > > Would something like this be a better patch? > > > > > -#define

Re: [PATCH 1/2] watchdog: Fix output

2016-03-19 Thread Don Zickus
- pr_emerg("BUG: soft lockup - CPU#%d stuck for %us! [%s:%d]\n", + pr_emerg("Detected soft lockup - CPU#%d stuck for %us! [%s:%d]\n", smp_processor_id(), duration, current->comm, task_pid_nr(c

Re: [PATCH 3.4 098/107] kernel/watchdog.c: touch_nmi_watchdog should only touch local cpu not every one

2016-03-18 Thread Don Zickus
l of the watchdogs (on each cpu). Perhaps a corner case > : will pop up (the scheduler?? to mimic touch_all_softlockup_watchdogs() ). > : > : But this does address an issue where if a system is locked up and one cpu > : is spewing out useful debug messages (or error messages), the hard

Re: [PATCH v2] watchdog: don't run proc_watchdog_update if new value is same as old

2016-03-15 Thread Don Zickus
, permanently consumes one > hw-PMU counter. > [ 955.783182] NMI watchdog: enabled on all CPUs, permanently consumes one > hw-PMU counter. > > There doesn't appear to be a reason for doing this work every time a write > occurs, so only do it when the values change. Acked-by:

Re: [PATCH] watchdog: don't run proc_watchdog_update if new value is same as old

2016-03-15 Thread Don Zickus
On Mon, Mar 14, 2016 at 11:02:31PM -0500, Josh Hunt wrote: > On 03/14/2016 11:29 AM, Don Zickus wrote: > > > >Hi Josh, > > > >I believe Uli thought the below patch might fix it. > > > >Cheers, > >Don > > Don > > It looks like I was

Re: [PATCH] watchdog: don't run proc_watchdog_update if new value is same as old

2016-03-14 Thread Don Zickus
On Mon, Mar 14, 2016 at 09:45:26AM -0500, Josh Hunt wrote: > On 03/14/2016 09:34 AM, Don Zickus wrote: > >On Sat, Mar 12, 2016 at 06:50:26PM -0500, Joshua Hunt wrote: > >>While working on a script to restore all sysctl params before a series of > >>tests I found that

Re: [PATCH] watchdog: don't run proc_watchdog_update if new value is same as old

2016-03-14 Thread Don Zickus
On Sat, Mar 12, 2016 at 06:50:26PM -0500, Joshua Hunt wrote: > While working on a script to restore all sysctl params before a series of > tests I found that writing any value into the > /proc/sys/kernel/{nmi_watchdog,soft_watchdog,watchdog,watchdog_thresh} > causes them to call proc_watchdog_updat

Re: [PATCH v5 3/3] Add BUG_XX() debugging hard/soft lockup detection

2016-02-03 Thread Don Zickus
On Wed, Feb 03, 2016 at 10:23:42AM -0700, Jeffrey Merkey wrote: > > Hmm, I am confused here. So you are saying because we are in the nmi > > handler you can not break into the system? The nmi handler prints some > > stuff to the screen, pokes the other cpus to print stuff to the screen and > > th
