On 03/08/2020 13:26, benbjiang(蒋彪) wrote: > > >> On Aug 3, 2020, at 4:16 PM, Dietmar Eggemann <dietmar.eggem...@arm.com> >> wrote: >> >> On 01/08/2020 04:32, Jiang Biao wrote: >>> From: Jiang Biao <benbji...@tencent.com> >>> >>> No need to preempt when there are only one runable CFS task with >>> other IDLE tasks on runqueue. The only one CFS task would always >>> be picked in that case. >>> >>> Signed-off-by: Jiang Biao <benbji...@tencent.com> >>> --- >>> kernel/sched/fair.c | 2 +- >>> 1 file changed, 1 insertion(+), 1 deletion(-) >>> >>> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c >>> index 04fa8dbcfa4d..8fb80636b010 100644 >>> --- a/kernel/sched/fair.c >>> +++ b/kernel/sched/fair.c >>> @@ -4527,7 +4527,7 @@ entity_tick(struct cfs_rq *cfs_rq, struct >>> sched_entity *curr, int queued) >>> return; >>> #endif >>> >>> - if (cfs_rq->nr_running > 1) >>> + if (cfs_rq->nr_running > cfs_rq.idle_h_nr_running + 1) >> >> cfs_rq is a pointer. > It is. Sorry about that. :) > >> >>> check_preempt_tick(cfs_rq, curr); >>> } >> >> You can't compare cfs_rq->nr_running with cfs_rq->idle_h_nr_running! >> >> There is a difference between cfs_rq->h_nr_running and >> cfs_rq->nr_running. The '_h_' stands for hierarchical. >> >> The former gives you hierarchical task accounting whereas the latter is >> the number of sched entities (representing tasks or taskgroups) enqueued >> in cfs_rq. >> >> In entity_tick(), cfs_rq->nr_running has to be used for the condition to >> call check_preempt_tick(). We want to check if curr has to be preempted >> by __pick_first_entity(cfs_rq) on this cfs_rq. >> >> entity_tick() is called for each sched entity (and so for each >> cfs_rq_of(se)) of the task group hierarchy (e.g. task p running in >> taskgroup /A/B : se(p) -> se(A/B) -> se(A)). > That’s true. I was thinking adding a new cfs_rq->idle_nr_running member to > track the per cfs_rq's IDLE task number, and reducing preemption here based > on that.
How would you deal with se's representing taskgroups which contain SCHED_IDLE and SCHED_NORMAL tasks or other taskgroups doing that? > I’m not sure if it’s ok to do that, because the IDLE class seems not to be so > pure that could tolerate starving. Not sure I understand but idle_sched_class is not the same as SCHED_IDLE (policy)? > We need an absolutely low priority class that could tolerate starving, which > could be used to co-locate offline tasks. But IDLE class seems to be not > *low* enough, if considering the fairness of CFS, and IDLE class still has a > weight. [...]