On Sun, Jun 02, 2013 at 03:43:07PM +0530, anish kumar wrote:
> Certain watchdog drivers use a timer to keep kicking the watchdog at
> a rate of 0.5s (HZ/2) untill userspace times out.They do this as
> we can't guarantee that watchdog will be pinged fast enough
> for all system loads, especially if timeout is configured for
> less than or equal to 1 second(basically small values).
> 
> As suggested by Wim Van Sebroeck & Guenter Roeck we should
> add this functionality of individual watchdog drivers in the core
> watchdog core.
> 
> Signed-off-by: anish kumar <anish198519851...@gmail.com>

Not exactly what I had in mind. My idea was to enable the softdog only if
the hardware watchdog's maximum timeout was low (say, less than a couple
of minutes), and if a timeout larger than its maximum value was configured.
In that case, I would have set the hardware watchdog to its maximum value
and use the softdog to ping it at a rate of, say, 50% of this maximum.

If userspace would not ping the watchdog within its configured value,
I would stop pinging the hardware watchdog and let it time out.

Guenter

> ---
>  drivers/watchdog/watchdog_dev.c |   34 +++++++++++++++++++++++++++++-----
>  include/linux/watchdog.h        |    1 +
>  2 files changed, 30 insertions(+), 5 deletions(-)
> 
> diff --git a/drivers/watchdog/watchdog_dev.c b/drivers/watchdog/watchdog_dev.c
> index faf4e18..0305803 100644
> --- a/drivers/watchdog/watchdog_dev.c
> +++ b/drivers/watchdog/watchdog_dev.c
> @@ -41,9 +41,14 @@
>  #include <linux/miscdevice.h>        /* For handling misc devices */
>  #include <linux/init.h>              /* For __init/__exit/... */
>  #include <linux/uaccess.h>   /* For copy_to_user/put_user/... */
> +#include <linux/timer.h>
> +#include <linux/jiffies.h>
>  
>  #include "watchdog_core.h"
>  
> +/* Timer heartbeat (500ms) */
> +#define WDT_TIMEOUT  (HZ/2) /* should this be sysfs? */
> +
>  /* the dev_t structure to store the dynamically allocated watchdog devices */
>  static dev_t watchdog_devt;
>  /* the watchdog device behind /dev/watchdog */
> @@ -73,16 +78,33 @@ static int watchdog_ping(struct watchdog_device *wddev)
>       if (!watchdog_active(wddev))
>               goto out_ping;
>  
> -     if (wddev->ops->ping)
> -             err = wddev->ops->ping(wddev);  /* ping the watchdog */
> -     else
> -             err = wddev->ops->start(wddev); /* restart watchdog */
> +     /* should we check ping interval value i.e. timeout value
> +        if it is less than certain threshold then only we 
> +        should add this logic of periodic pinging? */
> +     if (time_before(jiffies, (unsigned long)wddev->timeout)) {
> +             if (wddev->ops->ping)
> +                     err = wddev->ops->ping(wddev);/* ping the watchdog */
> +             else
> +                     err = wddev->ops->start(wddev);/* restart watchdog */
> +             mod_timer(&wddev->timer, jiffies + WDT_TIMEOUT);
> +     } else {
> +             /*
> +              *what we should when we find out that userspace
> +              *has timed out?
> +              **/
> +     }
>  
>  out_ping:
>       mutex_unlock(&wddev->lock);
>       return err;
>  }
>  
> +static void watchdog_ping_wrapper(unsigned long priv)
> +{
> +     struct watchdog_device *wdd = (void *)priv;
> +     watchdog_ping(wdd);
> +}
> +
>  /*
>   *   watchdog_start: wrapper to start the watchdog.
>   *   @wddev: the watchdog device to start
> @@ -109,7 +131,8 @@ static int watchdog_start(struct watchdog_device *wddev)
>       err = wddev->ops->start(wddev);
>       if (err == 0)
>               set_bit(WDOG_ACTIVE, &wddev->status);
> -
> +     
> +     mod_timer(&wddev->timer, jiffies + WDT_TIMEOUT);
>  out_start:
>       mutex_unlock(&wddev->lock);
>       return err;
> @@ -552,6 +575,7 @@ int watchdog_dev_register(struct watchdog_device 
> *watchdog)
>                       old_wdd = NULL;
>               }
>       }
> +     setup_timer(&watchdog->timer, 0, (long)watchdog_ping_wrapper);
>       return err;
>  }
>  
> diff --git a/include/linux/watchdog.h b/include/linux/watchdog.h
> index 2a3038e..e5f18f7 100644
> --- a/include/linux/watchdog.h
> +++ b/include/linux/watchdog.h
> @@ -84,6 +84,7 @@ struct watchdog_device {
>       const struct watchdog_ops *ops;
>       unsigned int bootstatus;
>       unsigned int timeout;
> +     struct timer_list timer;
>       unsigned int min_timeout;
>       unsigned int max_timeout;
>       void *driver_data;
> -- 
> 1.7.10.4
> 
> 
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Reply via email to