On Tue, Feb 24, 2015 at 09:28:35AM +0000, Juri Lelli wrote:
>dl_task_timer() may fire on a different rq from where a task was removed
>after throttling. Since the call path is:
>
>  dl_task_timer() ->
>    enqueue_task_dl() ->
>      enqueue_dl_entity() ->
>        replenish_dl_entity()
>
>and replenish_dl_entity() uses dl_se's rq, we can't use current's rq
>in dl_task_timer(), but we need to lock the task's previous one.
>
>Signed-off-by: Juri Lelli <juri.le...@arm.com>

Tested-by: Wanpeng Li <wanpeng...@linux.intel.com>

I see a panic when I run a dl task, kill it after several seconds, and
then repeat the process several times. The bug is triggered by commit
3960c8c0c789 ("sched: Make dl_task_time() use task_rq_lock()"); Juri's
patch fixes it.

[  313.352676] BUG: unable to handle kernel NULL pointer dereference at (null)
[  313.353483] IP: [<ffffffff8139ee28>] rb_erase+0x118/0x390
[  313.354060] PGD b5ddb067 PUD b5d96067 PMD 0 
[  313.354501] Oops: 0002 [#1] SMP 
[...]
[  313.356633] Call Trace:
[  313.356633]  [<ffffffff810b2cb7>] dequeue_pushable_dl_task+0x47/0x80
[  313.356633]  [<ffffffff810b46ff>] pick_next_task_dl+0x7f/0x150
[  313.356633]  [<ffffffff8178f7b9>] __schedule+0x839/0x8cb
[  313.356633]  [<ffffffff8178f947>] schedule+0x37/0x90
[  313.356633]  [<ffffffff8178fbae>] schedule_preempt_disabled+0xe/0x10
[  313.356633]  [<ffffffff810b5b18>] cpu_startup_entry+0x168/0x380
[  313.356633]  [<ffffffff810eb2e3>] ? clockevents_register_device+0xe3/0x150
[  313.356633]  [<ffffffff810eba96>] ? clockevents_config_and_register+0x26/0x30
[  313.356633]  [<ffffffff8104a96c>] start_secondary+0x14c/0x170
[  313.356633] Code: e2 fc 74 ab 48 89 c1 48 89 d0 48 8b 50 08 48 39 ca 74 48 f6 02 01 75 b3 48 8b 4a 10 48 89 c7 48 83 cf 01 48 89 48 08 48 89 42 10 <48> 89 39 48 8b 38 48 89 3a 48 83 e7 fc 48 89 10 0f 84 02 01 00
[  313.356633] RIP  [<ffffffff8139ee28>] rb_erase+0x118/0x390
[  313.356633]  RSP <ffff8800ba3efdc8>
[  313.356633] CR2: 0000000000000000
[  313.356633] ---[ end trace 5fbbfdbbc196604d ]---
[  313.356633] Kernel panic - not syncing: Attempted to kill the idle task!
[  313.356633] Kernel Offset: 0x0 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffff9fffffff)
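
For anyone who wants to see the locking mistake in isolation, here is a
minimal user-space sketch of it. Everything in it is a toy stand-in I made
up for illustration (toy_rq, toy_task, toy_task_rq_lock() and friends are
not kernel definitions); it only models the point from the changelog that
replenish_dl_entity() works on the rq of the task owning the timer, so
that is the rq which has to be locked, not the rq of whatever task is
current where the timer fires.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins for the kernel structures; NOT the real definitions. */
struct toy_rq {
	int cpu;
	bool locked;
};

struct toy_task {
	struct toy_rq *rq;	/* runqueue this task belongs to */
};

/* Models task_rq_lock(): lock and return the rq of the given task. */
static struct toy_rq *toy_task_rq_lock(struct toy_task *t)
{
	assert(!t->rq->locked);
	t->rq->locked = true;
	return t->rq;
}

/* Models task_rq_unlock(). */
static void toy_task_rq_unlock(struct toy_rq *rq)
{
	rq->locked = false;
}

/*
 * Models the timer handler. p is the throttled task that owns the timer,
 * curr is whatever task runs on the CPU where the timer fired.
 * use_current selects the old behaviour (lock current's rq) or the
 * patched one (lock p's rq).
 *
 * Returns true if p's rq, the one replenishment would touch, was held
 * while the handler ran.
 */
static bool toy_timer_locks_right_rq(struct toy_task *p,
				     struct toy_task *curr,
				     bool use_current)
{
	struct toy_rq *rq = toy_task_rq_lock(use_current ? curr : p);
	bool ok = p->rq->locked;

	toy_task_rq_unlock(rq);
	return ok;
}
```

With p on one toy rq and curr on another, calling
toy_timer_locks_right_rq() with use_current set reports that p's rq was
left unlocked, while the patched variant reports it locked. In the real
kernel the unlocked case means the dl rb-trees can be modified
concurrently, which would be consistent with the rb_erase() oops above.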

>Cc: Ingo Molnar <mi...@redhat.com>
>Cc: Peter Zijlstra <pet...@infradead.org>
>Cc: Kirill Tkhai <ktk...@parallels.com>
>Cc: Juri Lelli <juri.le...@gmail.com>
>Cc: linux-kernel@vger.kernel.org
>Fixes: 3960c8c0c789 ("sched: Make dl_task_time() use task_rq_lock()")
>---
> kernel/sched/deadline.c | 4 ++--
> 1 file changed, 2 insertions(+), 2 deletions(-)
>
>diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c
>index dbf12a9..519e468 100644
>--- a/kernel/sched/deadline.c
>+++ b/kernel/sched/deadline.c
>@@ -538,7 +538,7 @@ static enum hrtimer_restart dl_task_timer(struct hrtimer *timer)
>       unsigned long flags;
>       struct rq *rq;
> 
>-      rq = task_rq_lock(current, &flags);
>+      rq = task_rq_lock(p, &flags);
> 
>       /*
>        * We need to take care of several possible races here:
>@@ -593,7 +593,7 @@ static enum hrtimer_restart dl_task_timer(struct hrtimer *timer)
>               push_dl_task(rq);
> #endif
> unlock:
>-      task_rq_unlock(rq, current, &flags);
>+      task_rq_unlock(rq, p, &flags);
> 
>       return HRTIMER_NORESTART;
> }
>-- 
>2.3.0
>
>--
>To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
>the body of a message to majord...@vger.kernel.org
>More majordomo info at  http://vger.kernel.org/majordomo-info.html
>Please read the FAQ at  http://www.tux.org/lkml/