On Fri, Jan 08, 2021 at 01:51:11PM +0800, Joseph Jang wrote:
> Since dpm_watchdog just cover two functions __device_suspend() and
> device_resume(), we proposed to move it to core power suspend.c to extend
> its coverage and monitor more devices suspend hand issues.
> 
> We propose to use new name suspend watchdog and new timeout handler to
> cover more sleep hang issues. The new timeout handler will dump disk
> sleep task call trace at first round timeout and trigger kernel panic
> at second round timeout.
> The default timer for each round is defined in
> CONFIG_PM_SUSPEND_WATCHDOG_TIMEOUT.
> 
> Signed-off-by: Joseph Jang <josephj...@google.com>
> ---
> Changes since v4:
>  - Change #define PM_SUSPEND_WATCHDOG to CONFIG_PM_SUSPEND_WATCHDOG in 
> suspend_watchdog.c
>    to make sure we compile all suspend watchdog functions.
>  - Add suspend watchdog functions declared in suspend_watchdog.h to prevent 
> compile errors.
> 
> Changes since v3:
>  - Change the naming from sleep timer to suspend watchdog.
>  - Remove console_is_suspended() from console.h and printk.c
>  - Add static declaration for suspend_watchdog struct since we use it in 
> suspend.c only.
>  - Move suspend watchdog real logic to suspend_watchdog.c.
>  - Put the function prototypes in suspend_watchdog.h
>  - Use inline functions for function prototypes definition.
> 
> Changes since v2:
>  - Add kernel/power/suspend_timer.h in MAINTAINERS
>  - Share some corner cases that dpm_watchdog cannot cover.
>   Case#A: dpm_watchdog cannot cover suspend hang in suspend_enter().
>   File: kernel/power/suspend.c
>   int suspend_devices_and_enter(suspend_state_t state)
>   {
>    ... <SNIP>
> 
>          suspend_test_start();
>          error = dpm_suspend_start(PMSG_SUSPEND);  <== dpm_watchdog will be 
> enabled/disabled in this fucntion ...
>          if (error) {
>                  pr_err("Some devices failed to suspend, or early wake event 
> detected\n");
>                  goto Recover_platform;
>          }
>          suspend_test_finish("suspend devices");
>          if (suspend_test(TEST_DEVICES))
>                  goto Recover_platform;
> 
>          do {
>                  error = suspend_enter(state, &wakeup);  <== but we encounter 
> hang inside function suspend_enter() ...
>          } while (!error && !wakeup && platform_suspend_again(state));
> 
>   Case#B: dpm_watchdog cannot cover resume hang in dpm_resume_early() because 
> it enable/disable in device_resume().
>   Call trace:
>    __switch_to+0x174/0x194
>    __schedule+0xa60/0x1464
>    __cancel_work_timer+0x120/0x234
>    chg_pm_resume+0x2c/0xd8
>    dpm_run_callback+0x27c/0x624
>    device_resume_early+0x1e4/0x1f8
>    dpm_resume_early+0x350/0x8f4
>    suspend_devices_and_enter+0xffc/0x168c
>    pm_suspend+0xb54/0xdac
> 
>   File: drivers/base/power/main.c.
>   static int device_resume(struct device *dev, pm_message_t state, bool async)
>       if (!dpm_wait_for_superior(dev, async))
>           goto Complete;
>   
>       dpm_watchdog_set(&wd, dev);
>       device_lock(dev);
>   
>       /*
>    ... <SNIP>
>  
>       Unlock:
>           device_unlock(dev);
>           dpm_watchdog_clear(&wd);
> 
>   Case#C: dpm_watchdog cannot cover suspend hang in ksys_sync().
>    Call trace:
>     __switch_to+0x1b0/0x1cc
>     __schedule+0xac8/0x1314
>     io_schedule+0x94/0xc8
>     wait_on_page_bit+0x1f8/0x3a4
>     __filemap_fdatawait_range+0x134/0x150
>     sync_inodes_sb+0x368/0x584
>     sync_inodes_one_sb+0x18/0x24
>     iterate_supers+0x12c/0x2b8
>     ksys_sync+0x48/0x12c
>     enter_state+0x294/0x7bc
>     pm_suspend+0x164/0x2a8
> 
>  - Explain some enhancement by following.
>   Q1: Why not use the existing one?
>   struct dpm_watchdog {
>       struct device           *dev;
>       struct task_struct      *tsk;
>       struct timer_list       timer;
>   };
>   
>   A1: In kernel/power/suspend.c, we don't have "struct device " because 
> suspend.c is for core PM instead of device PM.
>   So we propose to use sleep_timer struct.
>   
>   struct sleep_timer {
>       struct task_struct      *tsk;
>       struct timer_list       timer;
>   };
>   
>   
>   Q2: Why not use stack memory for timer struct?
>   static void dpm_watchdog_set(struct dpm_watchdog *wd, struct device *dev)
>   {
>       ... <SNIP>  
>       timer_setup_on_stack(timer, dpm_watchdog_handler, 0);  <== dpm_watchdog 
> use stack memory for timer struct.
>       /* use same timeout value for both suspend and resume */
>       timer->expires = jiffies + HZ * CONFIG_DPM_WATCHDOG_TIMEOUT;
>       add_timer(timer);
>   }
>   
>   A2: dpm_watchdog use stack memory has limitation.
>   It cannot support starting watchdog at end of function like s2idle_enter().
>   We cannot use stack memory for this case because the timer struct will be 
> free when go back to caller.
>   
>   File: kernel/power/suspend.c
>   static void s2idle_enter(void)
>     {
>           trace_suspend_resume(TPS("machine_suspend"), PM_SUSPEND_TO_IDLE, 
> true);
>   
>   +       stop_sleep_timer(&st);
>   +
>           raw_spin_lock_irq(&s2idle_lock);
>           if (pm_wakeup_pending())
>                   goto out;
>   ... <SNIP>
>           s2idle_state = S2IDLE_STATE_NONE;
>           raw_spin_unlock_irq(&s2idle_lock);
>   
>   +       start_sleep_timer(&st);
>   +
>           trace_suspend_resume(TPS("machine_suspend"), PM_SUSPEND_TO_IDLE, 
> false);
>     }
>   
>   So we propose to declare a global sleep timer struct in suspend.c like 
> following.
>   
>   File: kernel/power/suspend.c
>    static DEFINE_RAW_SPINLOCK(s2idle_lock);
>    
>   +DECLARE_SLEEP_TIMER(st);
>   +
>    /**
>     * pm_suspend_default_s2idle - Check if suspend-to-idle is the default 
> suspend.
>   
>   Using global timer has another advantage is we could support double calling 
> add_timer()
>   because add_timer() can easily to delete pending timer and add new timer by 
> using the
>   same timer struct.
> 
>   Q3: Why do you need to change the timeout handler?
>   A3: dpm_watchdog_handler() need device struct to get device information, 
> but suspend.c doesn't have it.
>   In most of cases, we could know which device driver hang by tracing the 
> suspend thread call stack logs
>   without device struct information for debugging. So we propose to remove 
> device struct and related
>   information like "dev_driver_string(wd->dev)" and "dev_name(wd->dev)" from 
> timeout handler.
>   
>    static void dpm_watchdog_handler(struct timer_list *t)
>    {
>        struct dpm_watchdog *wd = from_timer(wd, t, timer);
>   
>        dev_emerg(wd->dev, "**** DPM device timeout ****\n");
>        show_stack(wd->tsk, NULL, KERN_EMERG);
>        panic("%s %s: unrecoverable failure\n",
>            dev_driver_string(wd->dev), dev_name(wd->dev));  <== we don't need 
> it.
>    }
>   
>    We propose to dump all disk sleep tasks call stack log twice to double 
> confirm suspend thread hang at
>    the same function and make sure other sleep tasks were stuck at the same 
> function for a long time.
>    We also try to resume conole if the console is suspended.
>    At the end of watchdog timeout handler, we propose to trigger kernel panic 
> to prevent system hang like
>    dpm_watchdog.
>   
>   
>   static void sleep_timeout_handler(struct timer_list *t)
>   {
>       struct sleep_timer *st = from_timer(st, t, timer);
>       static int timeout_count;
>   
>       pr_info("Sleep timeout (timer is %d seconds)\n",
>           (CONFIG_PM_SLEEP_TIMER_TIMEOUT));
>       show_stack(st->tsk, NULL, KERN_EMERG);
>       show_state_filter(TASK_UNINTERRUPTIBLE);
>   
>       if (timeout_count < 1) {
>           timeout_count++;
>           start_sleep_timer(st);
>           return;
>       }
>   
>       if (console_is_suspended())
>           resume_console();
>   
>       panic("Sleep timeout and panic\n");
>   }
> 
> Changes since v1:
>   - Add commit message to explain why move dpm_watchdog to 
> kernel/power/suspend.c.
>   - Remove dpm_watchdog related functions in drivers/base/power/main.c.
>   - Move suspend_timer.h from include/linux/ to kernel/power/.
> ---
>  MAINTAINERS                     |  4 ++
>  drivers/base/power/main.c       | 69 ---------------------------
>  kernel/power/Kconfig            | 27 +++++------
>  kernel/power/Makefile           |  1 +
>  kernel/power/suspend.c          | 19 ++++++++
>  kernel/power/suspend_watchdog.c | 84 +++++++++++++++++++++++++++++++++
>  kernel/power/suspend_watchdog.h | 40 ++++++++++++++++
>  kernel/printk/printk.c          |  2 +-
>  8 files changed, 162 insertions(+), 84 deletions(-)
>  create mode 100644 kernel/power/suspend_watchdog.c
>  create mode 100644 kernel/power/suspend_watchdog.h
> 
> diff --git a/MAINTAINERS b/MAINTAINERS
> index 867157311dc8..ecd988b4a838 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -7788,6 +7788,8 @@ F:      include/linux/freezer.h
>  F:   include/linux/pm.h
>  F:   include/linux/suspend.h
>  F:   kernel/power/
> +F:   kernel/power/suspend_watchdog.c
> +F:   kernel/power/suspend_watchdog.h
>  
>  HID CORE LAYER
>  M:   Jiri Kosina <ji...@kernel.org>
> @@ -16630,6 +16632,8 @@ F:    include/linux/freezer.h
>  F:   include/linux/pm.h
>  F:   include/linux/suspend.h
>  F:   kernel/power/
> +F:   kernel/power/suspend_watchdog.c
> +F:   kernel/power/suspend_watchdog.h

As you are now asking other people to maintain the code that you just
added, you will need to get their signed-off-by on this patch before I
can do anything with it.

thanks,

greg k-h

Reply via email to