On Mon, Oct 12, 2015 at 10:12:31AM +0800, Yuyang Du wrote: > On Mon, Oct 12, 2015 at 11:12:06AM +0200, Peter Zijlstra wrote:
> > So in the old code we had 'magic' to deal with the case where a cgroup > > was consuming less than 1 cpu's worth of runtime. For example, a single > > task running in the group. > > > > In that scenario it might be possible that the group entity weight: > > > > se->weight = (tg->shares * cfs_rq->weight) / tg->weight; > > > > Strongly deviates from the tg->shares; you want the single task reflect > > the full group shares to the next level; due to the whole distributed > > approximation stuff. > > Yeah, I thought so. > > > I see you've deleted all that code; see the former > > __update_group_entity_contrib(). > > Probably not there, it actually was an icky way to adjust things. Yeah, no argument there. > diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c > index 4df37a4..b184da0 100644 > --- a/kernel/sched/fair.c > +++ b/kernel/sched/fair.c > @@ -2370,7 +2370,7 @@ static inline long calc_tg_weight(struct task_group > *tg, struct cfs_rq *cfs_rq) > */ > tg_weight = atomic_long_read(&tg->load_avg); > tg_weight -= cfs_rq->tg_load_avg_contrib; > - tg_weight += cfs_rq_load_avg(cfs_rq); > + tg_weight += cfs_rq->load.weight; > > return tg_weight; > } > @@ -2380,7 +2380,7 @@ static long calc_cfs_shares(struct cfs_rq *cfs_rq, > struct task_group *tg) > long tg_weight, load, shares; > > tg_weight = calc_tg_weight(tg, cfs_rq); > - load = cfs_rq_load_avg(cfs_rq); > + load = cfs_rq->load.weight; > > shares = (tg->shares * load); > if (tg_weight) Aah, yes very much so. I completely overlooked that :-( When calculating shares we very much want the current load, not the load average. Also, should we do the below? At this point se->on_rq is still 0 so reweight_entity() will not update (dequeue/enqueue) the accounting, but we'll have just accounted the 'old' load.weight. Doing it this way around we'll first update the weight and then account it, which seems more accurate. --- diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c index 700eb548315f..d2efef565aed 100644 --- a/kernel/sched/fair.c +++ b/kernel/sched/fair.c @@ -3009,8 +3009,8 @@ enqueue_entity(struct cfs_rq *cfs_rq, struct sched_entity *se, int flags) */ update_curr(cfs_rq); enqueue_entity_load_avg(cfs_rq, se); - account_entity_enqueue(cfs_rq, se); update_cfs_shares(cfs_rq); + account_entity_enqueue(cfs_rq, se); if (flags & ENQUEUE_WAKEUP) { place_entity(cfs_rq, se, 0); -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/