On Fri, Apr 16, 2021 at 10:45:28PM +0200, Thomas Gleixner wrote:
> On Tue, Apr 13 2021 at 21:35, Paul E. McKenney wrote:
> >  #define WATCHDOG_INTERVAL (HZ >> 1)
> >  #define WATCHDOG_THRESHOLD (NSEC_PER_SEC >> 4)
> 
> Didn't we discuss that the threshold is too big ?

Indeed we did!  How about like this, so that WATCHDOG_THRESHOLD is at 500
microseconds and we tolerate up to 125 microseconds of delay
(WATCHDOG_MAX_SKEW) during the timer-read process?
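
For reference, the arithmetic behind the old and new values can be checked
with a small userspace sketch.  This is not kernel code: the OLD_* names,
WATCHDOG_INTERVAL_NS, and skew_ppm() are hypothetical helpers introduced
only for illustration, and the interval is expressed in nanoseconds here
rather than in jiffies as the kernel does.

```c
#include <assert.h>
#include <stdint.h>

/* Userspace stand-ins for the kernel constants of the same names. */
#define NSEC_PER_USEC 1000LL
#define NSEC_PER_SEC  1000000000LL

/* Old values, taken from the lines the patch removes (hypothetical names). */
#define OLD_WATCHDOG_THRESHOLD (NSEC_PER_SEC >> 4)   /* 62,500,000 ns */
#define OLD_WATCHDOG_MAX_SKEW  (NSEC_PER_SEC >> 6)   /* 15,625,000 ns */

/* New values, as in the lines the patch adds. */
#define WATCHDOG_THRESHOLD (500 * NSEC_PER_USEC)     /* 500,000 ns */
#define WATCHDOG_MAX_SKEW  (WATCHDOG_THRESHOLD / 4)  /* 125,000 ns */

/* Assumption: HZ >> 1 jiffies is half a second, written here in ns. */
#define WATCHDOG_INTERVAL_NS (NSEC_PER_SEC >> 1)

/* Compile-time check that the read-delay tolerance is 125 microseconds. */
static_assert(WATCHDOG_MAX_SKEW == 125 * NSEC_PER_USEC,
              "WATCHDOG_MAX_SKEW should be 125 microseconds");

/* Skew needed to trip the threshold, in parts per million of the interval. */
static int64_t skew_ppm(int64_t threshold_ns, int64_t interval_ns)
{
	return threshold_ns * 1000000LL / interval_ns;
}
```

Under these assumptions, skew_ppm(OLD_WATCHDOG_THRESHOLD,
WATCHDOG_INTERVAL_NS) is 125000 ppm (12.5%), and
skew_ppm(WATCHDOG_THRESHOLD, WATCHDOG_INTERVAL_NS) is 1000 ppm (0.1%).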

I am firing up overnight tests.

                                                        Thanx, Paul

------------------------------------------------------------------------

commit 6c52b5f3cfefd6e429efc4413fd25e3c394e959f
Author: Paul E. McKenney <paul...@kernel.org>
Date:   Fri Apr 16 16:19:43 2021 -0700

    clocksource: Reduce WATCHDOG_THRESHOLD
    
    Currently, WATCHDOG_THRESHOLD is set to detect a 62.5-millisecond skew in
    a 500-millisecond WATCHDOG_INTERVAL.  This requires that clocks be skewed
    by more than 12.5% in order to be marked unstable.  Except that a clock
    that is skewed by that much is probably destroying unsuspecting software
    right and left.  And given that there are now checks for false-positive
    skews due to delays between reading the two clocks, and given that current
    hardware clocks all increment well in excess of 1MHz, it should be possible
    to greatly decrease WATCHDOG_THRESHOLD.
    
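    Therefore, decrease WATCHDOG_THRESHOLD from the current 62.5 milliseconds
    down to 500 microseconds, and decrease WATCHDOG_MAX_SKEW from the current
    15.625 milliseconds down to one quarter of WATCHDOG_THRESHOLD, or 125
    microseconds.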
    
    Suggested-by: Thomas Gleixner <t...@linutronix.de>
    Cc: John Stultz <john.stu...@linaro.org>
    Cc: Stephen Boyd <sb...@kernel.org>
    Cc: Jonathan Corbet <cor...@lwn.net>
    Cc: Mark Rutland <mark.rutl...@arm.com>
    Cc: Marc Zyngier <m...@kernel.org>
    Cc: Andi Kleen <a...@linux.intel.com>
    [ paulmck: Apply Rik van Riel feedback. ]
    Reported-by: Chris Mason <c...@fb.com>
    Signed-off-by: Paul E. McKenney <paul...@kernel.org>

diff --git a/kernel/time/clocksource.c b/kernel/time/clocksource.c
index 8f4967e59b05..4ec19a13dcf0 100644
--- a/kernel/time/clocksource.c
+++ b/kernel/time/clocksource.c
@@ -125,8 +125,8 @@ static void __clocksource_change_rating(struct clocksource *cs, int rating);
  * Interval: 0.5sec Threshold: 0.0625s
  */
 #define WATCHDOG_INTERVAL (HZ >> 1)
-#define WATCHDOG_THRESHOLD (NSEC_PER_SEC >> 4)
-#define WATCHDOG_MAX_SKEW (NSEC_PER_SEC >> 6)
+#define WATCHDOG_THRESHOLD (500 * NSEC_PER_USEC)
+#define WATCHDOG_MAX_SKEW (WATCHDOG_THRESHOLD / 4)
 
 static void clocksource_watchdog_work(struct work_struct *work)
 {
