Hi Peter,
On Mon, Nov 03, 2014 at 11:41:11AM +0100, Peter Zijlstra wrote:
>On Fri, Oct 31, 2014 at 03:28:17PM +0800, Wanpeng Li wrote:
>> Hi all,
>>
>> I observe that dl task can't be migrated to other cpus during cpu hotplug, in
>> addition, task may/may not be running again if cpu is added back. The root cause
>> which I found is that dl task will be throttled and removed from dl rq after
>> consuming all budget, which leads to stop task can't pick it up from dl rq and
>> migrate to other cpus during hotplug.
>>
>> So I try two methods.
>>
>> - add throttled dl sched_entity to a throttled_list, the list will be traversed
>>   during cpu hotplug, and the dl sched_entity will be picked and enqueue, then
>>   stop task will pick and migrate it. However, dl sched_entity is throttled again
>>   before stop task running since the below path. This path will set rq->online 0
>>   which lead to set_rq_offline() won't be called in function migration_call().
>>
>
>This seems wrong to me; this screws around with the CBS by replenishing
>too soon.

Agreed.

>
>> @@ -1593,9 +1602,20 @@ static void rq_online_dl(struct rq *rq)
>>  /* Assumes rq->lock is held */
>>  static void rq_offline_dl(struct rq *rq)
>>  {
>> +        struct task_struct *p, *n;
>> +
>>          if (rq->dl.overloaded)
>>                  dl_clear_overload(rq);
>>
>> +        /* Make sched_dl_entity available for pick_next_task() */
>> +        list_for_each_entry_safe(p, n, &rq->dl.throttled_list, dl.throttled_node) {
>> +                p->dl.dl_throttled = 0;
>> +                hrtimer_cancel(&p->dl.dl_timer);
>> +                p->dl.dl_runtime = p->dl.dl_runtime;
>> +                if (task_on_rq_queued(p))
>> +                        enqueue_task_dl(rq, p, ENQUEUE_REPLENISH);
>> +        }
>> +
>>          cpudl_set(&rq->rd->cpudl, rq->cpu, 0, 0);
>>  }
>
>
>So what is wrong with making dl_task_timer() deal with it? The timer
>will still fire on the correct time, canceling it and/or otherwise
>messing with the CBS is wrong. Once it fires, all we need to do is
>migrate it to another cpu (preferably one that is still online of course
>:-).

Do you mean that I should push the task to another cpu in
dl_task_timer() if the rq is offline? In addition, what happens if the
dl task can't preempt on the other cpu?

Regards,
Wanpeng Li
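
To make the question concrete, here is a rough userspace sketch of the idea as I understand it: leave the replenishment timer alone, and when it fires on an offline rq, move the task to an online cpu before enqueueing it. This is only a toy model, not kernel code. Every toy_* name and struct is hypothetical, the replenishment is simplified to a single step (deadline = now + period), and locking and the "can't preempt on the other cpu" case are not modelled at all.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct rq; only the fields the sketch needs. */
struct toy_rq {
    int cpu;
    bool online;
    int nr_dl_running;      /* number of queued deadline tasks */
};

/* Hypothetical stand-in for a deadline task and its sched_dl_entity. */
struct toy_dl_task {
    int cpu;                /* cpu whose rq the task belongs to */
    bool throttled;         /* true while waiting for replenishment */
    long long runtime;      /* remaining budget */
    long long dl_runtime;   /* full budget per period */
    long long deadline;     /* absolute deadline */
    long long dl_period;
};

/* Return the first online rq other than 'from', or NULL if there is none. */
static struct toy_rq *toy_find_online_rq(struct toy_rq *rqs, size_t n, int from)
{
    for (size_t i = 0; i < n; i++) {
        if ((int)i != from && rqs[i].online)
            return &rqs[i];
    }
    return NULL;
}

/*
 * Models the replenishment timer firing at time 'now'. The timer is never
 * cancelled or fired early, so the budget is only refilled at the time the
 * CBS rules scheduled it. Returns the cpu the task was enqueued on, or -1
 * if no online cpu exists (the task then stays throttled).
 */
static int toy_dl_task_timer(struct toy_rq *rqs, size_t n,
                             struct toy_dl_task *p, long long now)
{
    assert(p->cpu >= 0 && (size_t)p->cpu < n);

    struct toy_rq *rq = &rqs[p->cpu];

    /* If the task's rq went offline while it was throttled, pick another. */
    if (!rq->online) {
        rq = toy_find_online_rq(rqs, n, p->cpu);
        if (!rq)
            return -1;
        p->cpu = rq->cpu;
    }

    /* Simplified one-step replenishment; an assumption of this sketch. */
    p->throttled = false;
    p->runtime = p->dl_runtime;
    p->deadline = now + p->dl_period;

    /* Enqueue on the (possibly new) rq. */
    rq->nr_dl_running++;
    return p->cpu;
}
```

In this model a task throttled on an offlined cpu 0 would be enqueued on the first online cpu when its timer fires, with its budget refilled no earlier than the timer expiry.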