On Mon, Feb 16, 2015 at 03:38:34PM +0300, Kirill Tkhai wrote:
> We shouldn't enqueue migrating tasks. Please, try this one instead ;)

Ha, we should amend that task-rq-lock loop for that. See below.

I've not yet tested; going to try and reconstruct a .config that
triggers the oops.

---
Subject: sched/dl: Prevent enqueue of a sleeping task in dl_task_timer()
From: Kirill Tkhai <tk...@yandex.ru>
Date: Mon, 16 Feb 2015 15:38:34 +0300

A deadline task may be throttled and dequeued at the same time.
This happens, when it becomes throttled in schedule(), which
is called to go to sleep:

	current->state = TASK_INTERRUPTIBLE;
	schedule()
	    deactivate_task()
	        dequeue_task_dl()
	            update_curr_dl()
	                start_dl_timer()
	            __dequeue_task_dl()
	    prev->on_rq = 0;

Later the timer fires, but the task is still dequeued:

	dl_task_timer()
	    enqueue_task_dl() /* queues on dl_rq; on_rq remains 0 */

Someone wakes it up:

	try_to_wake_up()
	    enqueue_dl_entity()
	        BUG_ON(on_dl_rq())

Patch fixes this problem, it prevents queueing !on_rq tasks
on dl_rq.

Also teach the rq-lock loop about TASK_ON_RQ_MIGRATING as per
cca26e8009d1 ("sched: Teach scheduler to understand
TASK_ON_RQ_MIGRATING state").

Fixes: 1019a359d3dc ("sched/deadline: Fix stale yield state")
Cc: Ingo Molnar <mi...@kernel.org>
Cc: Juri Lelli <juri.le...@arm.com>
Reported-by: Fengguang Wu <fengguang...@intel.com>
Signed-off-by: Kirill Tkhai <ktk...@parallels.com>
[peterz: Wrote comment; fixed task-rq-lock loop]
Signed-off-by: Peter Zijlstra (Intel) <pet...@infradead.org>
Link: http://lkml.kernel.org/r/1374601424090...@web4j.yandex.ru
---
 kernel/sched/deadline.c | 25 ++++++++++++++++++++++---
 1 file changed, 22 insertions(+), 3 deletions(-)

--- a/kernel/sched/deadline.c
+++ b/kernel/sched/deadline.c
@@ -515,9 +515,8 @@ static enum hrtimer_restart dl_task_time
 again:
 	rq = task_rq(p);
 	raw_spin_lock(&rq->lock);
-
-	if (rq != task_rq(p)) {
-		/* Task was moved, retrying. */
+	if (rq != task_rq(p) || task_on_rq_migrating(p)) {
+		/* Task was move{d,ing}, retry */
 		raw_spin_unlock(&rq->lock);
 		goto again;
 	}
@@ -541,6 +540,26 @@ static enum hrtimer_restart dl_task_time
 
 	sched_clock_tick();
 	update_rq_clock(rq);
+
+	/*
+	 * If the throttle happened during sched-out; like:
+	 *
+	 *   schedule()
+	 *     deactivate_task()
+	 *       dequeue_task_dl()
+	 *         update_curr_dl()
+	 *           start_dl_timer()
+	 *         __dequeue_task_dl()
+	 *     prev->on_rq = 0;
+	 *
+	 * We can be both throttled and !queued. Replenish the counter
+	 * but do not enqueue -- wait for our wakeup to do that.
+	 */
+	if (!task_on_rq_queued(p)) {
+		replenish_dl_entity(dl_se, dl_se);
+		goto unlock;
+	}
+
 	enqueue_task_dl(rq, p, ENQUEUE_REPLENISH);
 	if (dl_task(rq->curr))
 		check_preempt_curr_dl(rq, p, 0);
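
For anyone who wants to poke at the race without a kernel tree at hand, here
is a rough userspace sketch of the three steps from the changelog. Everything
in it (struct toy_task, the toy_* helpers, the with_guard flag) is made up for
illustration and is not kernel code; it only models the on_rq / on_dl_rq
bookkeeping, with no locking and no real timers.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for the task state discussed above; hypothetical, not kernel code. */
struct toy_task {
	bool on_rq;        /* models p->on_rq, i.e. task_on_rq_queued(p) */
	bool on_dl_rq;     /* models on_dl_rq(&p->dl) */
	bool throttled;    /* models "the replenishment timer is armed" */
	bool replenished;  /* set once the timer has refilled the runtime */
};

/* schedule() path: the task gets throttled and dequeued in one go. */
static void toy_sleep_throttled(struct toy_task *t)
{
	t->throttled = true;   /* update_curr_dl() -> start_dl_timer() */
	t->on_dl_rq = false;   /* __dequeue_task_dl() */
	t->on_rq = false;      /* prev->on_rq = 0 */
}

/* dl_task_timer(): with_guard selects the patched behaviour. */
static void toy_timer_fires(struct toy_task *t, bool with_guard)
{
	assert(t->throttled);  /* in this model the timer is only armed on throttle */
	t->replenished = true;
	if (with_guard && !t->on_rq)
		return;            /* replenish only; leave the enqueue to the wakeup */
	t->on_dl_rq = true;    /* enqueue_task_dl(rq, p, ENQUEUE_REPLENISH) */
}

/* try_to_wake_up(): returns false where BUG_ON(on_dl_rq()) would have fired. */
static bool toy_wakeup(struct toy_task *t)
{
	if (t->on_dl_rq)
		return false;
	t->on_dl_rq = true;
	t->on_rq = true;
	return true;
}
```

Calling toy_sleep_throttled(), then toy_timer_fires(), then toy_wakeup() on a
zeroed toy_task makes the wakeup report the BUG condition when with_guard is
false, and succeed when it is true.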