On Mon, 2019-07-01 at 12:29 -0400, Josef Bacik wrote: > On Fri, Jun 28, 2019 at 04:49:06PM -0400, Rik van Riel wrote: > > The runnable_load magic is used to quickly propagate information > > about > > runnable tasks up the hierarchy of runqueues. The runnable_load_avg > > is > > mostly used for the load balancing code, which only examines the > > value at > > the root cfs_rq. > > > > Redefine the root cfs_rq runnable_load_avg to be the sum of > > task_h_loads > > of the runnable tasks. This works because the hierarchical > > runnable_load of > > a task is already equal to the task_se_h_load today. This provides > > enough > > information to the load balancer. > > > > The runnable_load_avg of the cgroup cfs_rqs does not appear to be > > used for anything, so don't bother calculating those. > > > > This removes one of the things that the code currently traverses > > the > > cgroup hierarchy for, and getting rid of it brings us one step > > closer > > to a flat runqueue for the CPU controller. > > > > My memory on this stuff is very hazy, but IIRC we had the > runnable_sum and the > runnable_avg separated out because you could have the avg lag behind > the sum. > So like you enqueue a bunch of new entities who's avg may have > decayed a bunch > and so their overall load is not felt on the CPU until they start > running, and > now you've overloaded that CPU. The sum was there to make sure new > things > coming onto the CPU added actual load to the queue instead of looking > like there > was no load. > > Is this going to be a problem now with this new code?
That is a good question! On the one hand, you may well be right. On the other hand, while I see the old code calculating runnable_sum, I don't really see it _using_ it to drive scheduling decisions. It would be easy to define the CPU cfs_rq->runnable_load_sum as being the sum of task_se_h_weight() of each runnable task on the CPU (for example), but what would we use it for? What am I missing? > +static inline void > > +update_runnable_load_avg(struct sched_entity *se) > > +{ > > + struct cfs_rq *root_cfs_rq = &cfs_rq_of(se)->rq->cfs; > > + long new_h_load, delta; > > + > > + SCHED_WARN_ON(!entity_is_task(se)); > > + > > + if (!se->on_rq) > > + return; > > > > - sub_positive(&cfs_rq->avg.runnable_load_avg, se- > > >avg.runnable_load_avg); > > - sub_positive(&cfs_rq->avg.runnable_load_sum, > > - se_runnable(se) * se->avg.runnable_load_sum); > > + new_h_load = task_se_h_load(se); > > + delta = new_h_load - se->enqueued_h_load; > > + root_cfs_rq->avg.runnable_load_avg += delta; > > Should we use add_positive here? Thanks, Yes, we should use add_positive. I'll do that for v3. -- All Rights Reversed.
signature.asc
Description: This is a digitally signed message part