Thanks, Petr, for the prompt response.

On Fri 2020-10-16 11:51:09, Joseph Jang wrote:
> From: josephjang <josephj...@google.com>
>
> Add suspend timeout handler to prevent device stuck during suspend/
> resume process. Suspend timeout handler will dump disk sleep task
> at first round timeout and trigger kernel panic at second round timeout.
> The default timer for each round is 30 seconds.

A better solution would be to resume instead of panic().

[Joseph] suspend_timeout() will trigger a kernel panic() only when the
suspend thread is stuck (deadlock/hang) for 2*30 seconds. At that point,
I don't know how to resume the suspend thread, so all I can do is trigger
a panic to reboot the system. If you have a better suggestion, I am
willing to study it.

> Note: Can use following command to simulate suspend hang for testing.
> adb shell echo 1 > /sys/power/pm_hang

This looks dangerous. It adds a simple way to panic() the system.

First, it should get enabled separately, e.g.
CONFIG_TEST_PM_SLEEP_MONITOR.

Second, I would add it as a module that might get loaded
and unloaded.

[Joseph] I agree to add a new compile flag for the test module. I think it
is better to create a separate patch for the new test module, right?

> diff --git a/kernel/power/suspend.c b/kernel/power/suspend.c
> index 8b1bb5ee7e5d..6f2679cfd9d1 100644
> --- a/kernel/power/suspend.c
> +++ b/kernel/power/suspend.c

Using kthread looks like overkill to me. I wonder how this actually
works when the kthreads get frozen. It might be enough to implement
just a timer callback. Start the timer in start_suspend_mon() and
delete it in stop_suspend_mon(). Or am I missing anything?

Anyway, the kthread implementation looks a bit hairy. If you really
need to use kthread, I suggest using the kthread_worker API. You would
need to run an init work to set up the RT scheduling. Then you
could just call kthread_queue_delayed_work()
and kthread_cancel_delayed_work_sync() to start and stop
the monitor.

[Joseph] Actually, I had previously thought we could just use
add_timer()/del_timer_sync() for start_suspend_mon()/stop_suspend_mon().
But I was not sure whether add_timer() would have any performance impact
on the suspend thread. So I tried creating a suspend monitor kthread and
just flipping the flag in the suspend thread.

Thank you,
Joseph.
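
To make the timer-only idea above concrete, here is a minimal userspace model of the two-round logic (dump tasks on the first timeout, panic on the second). It is only a sketch and not kernel code: `struct suspend_mon`, `enum mon_action` and `suspend_mon_tick()` are hypothetical names, and the 30000 ms value is assumed from the "30 seconds" in the patch description. In the kernel, the tick function would presumably become a timer callback set up with timer_setup(), armed with mod_timer() and removed with del_timer_sync().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Assumed from the patch description: 30 seconds per round. */
#define SUSPEND_TIMER_TIMEOUT_MS 30000UL

/* Hypothetical result codes standing in for the real side effects. */
enum mon_action { MON_NONE, MON_DUMP_TASKS, MON_PANIC };

/* Hypothetical monitor state; a kernel version would embed a timer_list. */
struct suspend_mon {
	bool armed;
	unsigned int timeout_count;
	unsigned long expires_ms;	/* absolute expiry time */
};

/* Arm the monitor when suspend begins. */
void start_suspend_mon(struct suspend_mon *m, unsigned long now_ms)
{
	assert(m != NULL);
	m->armed = true;
	m->timeout_count = 0;
	m->expires_ms = now_ms + SUSPEND_TIMER_TIMEOUT_MS;
}

/* Disarm the monitor when suspend/resume completes. */
void stop_suspend_mon(struct suspend_mon *m)
{
	assert(m != NULL);
	m->armed = false;
	m->timeout_count = 0;
}

/* Models the timer callback: decide what to do at time now_ms. */
enum mon_action suspend_mon_tick(struct suspend_mon *m, unsigned long now_ms)
{
	assert(m != NULL);
	if (!m->armed || now_ms < m->expires_ms)
		return MON_NONE;

	m->timeout_count++;
	if (m->timeout_count == 1) {
		/* First round: report, then re-arm for a second round. */
		m->expires_ms = now_ms + SUSPEND_TIMER_TIMEOUT_MS;
		return MON_DUMP_TASKS;
	}

	/* Second round: give up. */
	m->armed = false;
	return MON_PANIC;
}
```

In this model, start and stop only write a few fields, which is the cost that arming and cancelling a timer would add to the suspend path.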
> @@ -114,6 +251,10 @@ static void s2idle_enter(void)
>  	s2idle_state = S2IDLE_STATE_NONE;
>  	raw_spin_unlock_irq(&s2idle_lock);
>
> +#ifdef CONFIG_PM_SLEEP_MONITOR
> +	start_suspend_mon();
> +#endif

It is better to solve this by defining start_suspend_mon() as an empty
function when the config option is disabled. For example, see
how vgacon_text_force() is defined in console.h.

[Joseph] Thank you for the good suggestion. May I know whether I could use
IS_ENABLED()?

if (IS_ENABLED(CONFIG_PM_SLEEP_MONITOR))
	start_suspend_mon();

Best Regards,
Petr

Thank you,
Joseph.

On Fri, Oct 16, 2020 at 5:01 PM, Petr Mladek <pmla...@suse.com> wrote:
>
> On Fri 2020-10-16 11:51:09, Joseph Jang wrote:
> > From: josephjang <josephj...@google.com>
> >
> > Add suspend timeout handler to prevent device stuck during suspend/
> > resume process. Suspend timeout handler will dump disk sleep task
> > at first round timeout and trigger kernel panic at second round timeout.
> > The default timer for each round is 30 seconds.
>
> A better solution would be to resume instead of panic().
>
> > Note: Can use following command to simulate suspend hang for testing.
> > adb shell echo 1 > /sys/power/pm_hang
>
> This looks dangerous. It adds a simple way to panic() the system.
>
> First, it should get enabled separately, e.g.
> CONFIG_TEST_PM_SLEEP_MONITOR.
>
> Second, I would add it as a module that might get loaded
> and unloaded.
> >
> > diff --git a/kernel/power/suspend.c b/kernel/power/suspend.c
> > index 8b1bb5ee7e5d..6f2679cfd9d1 100644
> > --- a/kernel/power/suspend.c
> > +++ b/kernel/power/suspend.c
>
> > +static int suspend_monitor_kthread(void *arg)
> > +{
> > +	long err;
> > +	struct sched_param param = {.sched_priority
> > +		= MAX_RT_PRIO-1};
> > +	static int timeout_count;
> > +	static long timeout;
> > +
> > +	pr_info("Init ksuspend_mon thread\n");
> > +
> > +	sched_setscheduler(current, SCHED_FIFO, &param);
> > +
> > +	timeout_count = 0;
> > +	timeout = MAX_SCHEDULE_TIMEOUT;
> > +
> > +	do {
> > +		/* Wait suspend timer timeout */
> > +		err = wait_event_interruptible_timeout(
> > +			power_suspend_waitqueue,
> > +			(suspend_mon_toggle != TOGGLE_NONE),
> > +			timeout);
> > +
> > +		mutex_lock(&suspend_mon_lock);
> > +		/* suspend monitor state change */
> > +		if (suspend_mon_toggle != TOGGLE_NONE) {
> > +			if (suspend_mon_toggle == TOGGLE_START) {
> > +				timeout = msecs_to_jiffies(
> > +					SUSPEND_TIMER_TIMEOUT_MS);
> > +				pr_info("Start suspend monitor\n");
> > +			} else if (suspend_mon_toggle == TOGGLE_STOP) {
> > +				timeout = MAX_SCHEDULE_TIMEOUT;
> > +				timeout_count = 0;
> > +				pr_info("Stop suspend monitor\n");
> > +			}
> > +			suspend_mon_toggle = TOGGLE_NONE;
> > +			mutex_unlock(&suspend_mon_lock);
> > +			continue;
> > +		}
> > +		mutex_unlock(&suspend_mon_lock);
> > +
> > +		/* suspend monitor event handler */
> > +		if (err == 0) {
> > +			timeout_count++;
> > +			suspend_timeout(timeout_count);
> > +		} else if (err == -ERESTARTSYS) {
> > +			pr_info("Exit ksuspend_mon!");
> > +			break;
> > +		}
> > +	} while (1);
> > +
> > +	return 0;
> > +}
>
> Using kthread looks like overkill to me. I wonder how this actually
> works when the kthreads get frozen. It might be enough to implement
> just a timer callback. Start the timer in start_suspend_mon() and
> delete it in stop_suspend_mon(). Or am I missing anything?
>
> Anyway, the kthread implementation looks a bit hairy.
> If you really need to use kthread, I suggest using the kthread_worker
> API. You would need to run an init work to set up the RT scheduling.
> Then you could just call kthread_queue_delayed_work()
> and kthread_cancel_delayed_work_sync() to start and stop
> the monitor.
>
>
> > @@ -114,6 +251,10 @@ static void s2idle_enter(void)
> >  	s2idle_state = S2IDLE_STATE_NONE;
> >  	raw_spin_unlock_irq(&s2idle_lock);
> >
> > +#ifdef CONFIG_PM_SLEEP_MONITOR
> > +	start_suspend_mon();
> > +#endif
>
> It is better to solve this by defining start_suspend_mon() as an empty
> function when the config option is disabled. For example, see
> how vgacon_text_force() is defined in console.h.
>
> Best Regards,
> Petr

--
Embedded Software engineer
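
For reference, the empty-function approach suggested in the review can be sketched as below. This is a userspace illustration under assumptions, not the actual patch: `suspend_mon_starts`, `suspend_mon_active` and `s2idle_enter_model()` are hypothetical names used only so the effect is observable. Compiling with -DCONFIG_PM_SLEEP_MONITOR selects the "real" variant; without it the calls compile to nothing and the caller needs no #ifdef.

```c
#include <assert.h>

/* Hypothetical counter so the effect of the stubs can be observed. */
static int suspend_mon_starts;

#ifdef CONFIG_PM_SLEEP_MONITOR
/* Hypothetical flag guarding against starting the monitor twice. */
static int suspend_mon_active;

static inline void start_suspend_mon(void)
{
	assert(!suspend_mon_active);
	suspend_mon_active = 1;
	suspend_mon_starts++;
}

static inline void stop_suspend_mon(void)
{
	suspend_mon_active = 0;
}
#else
/* Option disabled: empty stubs, so call sites stay free of #ifdef. */
static inline void start_suspend_mon(void) { }
static inline void stop_suspend_mon(void) { }
#endif

/* Stand-in for a caller such as s2idle_enter(); note: no #ifdef here. */
int s2idle_enter_model(void)
{
	start_suspend_mon();
	/* ... the idle loop would run here ... */
	stop_suspend_mon();
	return suspend_mon_starts;
}
```

The IS_ENABLED() form from the reply keeps the call visible to the compiler in both configurations, so a declaration of start_suspend_mon() would still have to exist when the option is off; the stub form avoids the conditional at the call site entirely.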