On 29/08/16 02:37, Yuyang Du wrote:
> On Tue, Aug 23, 2016 at 04:39:51PM +0100, Dietmar Eggemann wrote:
>> On 23/08/16 15:45, Vincent Guittot wrote:
>>> On 23 August 2016 at 16:13, Peter Zijlstra <pet...@infradead.org> wrote:
>>>> On Tue, Aug 23, 2016 at 03:28:19PM +0200, Vincent Guittot wrote:
>>>>> I still wonder if using a flat util hierarchy is the right solution to
>>>>> solve this problem with utilization and task group. I have noticed
>>>>> exact same issues with load that generates weird task placement
>>>>> decision and i think that we should probably try to solve both wrong
>>>>> behavior with same mechanism. but this is not possible with flat
>>>>> hierarchy for load
>>>>>
>>>>> Let me take an example.
>>>>> TA is a always running task on CPU1 in group /root/level1/
>>>>> TB wakes up on CPU0 and moves TA into group /root/level2/
>>>>> Even if TA stays on CPU1, runnable_load_avg of CPU1 root cfs rq will
>>>>> become 0.
>>>>
>>>> Because while we migrate the load_avg on /root/level2, we do not
>>>> propagate the load_avg up the hierarchy?
>>>
>>> yes. At now, the load of a cfs_rq and the load of its sched_entity
>>> that represents it at parent level are disconnected
>>
>> I guess you say 'disconnected' because cfs_rq and se (w/ cfs_rq eq.
>> se->my_q) are now independent pelt signals where as before the rewrite
>> they were 'connected' for load via __update_tg_runnable_avg(),
>> __update_group_entity_contrib() in __update_entity_load_avg_contrib()
>> and for utilization via 'se->avg.utilization_avg_contrib =
>> group_cfs_rq(se)->utilization_load_avg' in
>> __update_entity_utilization_avg_contrib().
>
> I don't understand what exactly "disconnected" means, but with respect to
> group_entity's load_avg, nothing is changed essentially:
>
True, but this is the update_cfs_shares() side of things.

> group_entity_load_avg = my_cfs_rq_load_avg / tg_load_avg * tg_shares
>

'Connected' for me in the old implementation stands for the fact that for
every call to update_entity_load_avg(se, 1) (with !entity_is_task(se)),
the contribution of the group cfs_rq (se->my_q) towards the se is updated
in __update_entity_[load|utilization]_avg_contrib and the resulting delta
is added to the appropriate cfs_rq (se->cfs_rq) values immediately. Doing
this in for_each_sched_entity(se) gives this nice propagation effect
towards the root cfs_rq.

In the new implementation se, se->my_q and se->cfs_rq have independent PELT
signals, hence the 'disconnected'.
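
To make the difference concrete, here is a toy model of the two behaviours.
It is only a sketch and not kernel code: the toy_* structures and functions
are hypothetical stand-ins, the shares scaling from update_cfs_shares() is
left out, and PELT decay is ignored completely. It only shows where a load
delta ends up when a task is attached two levels below the root.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, heavily simplified stand-ins for cfs_rq and sched_entity. */
struct toy_cfs_rq {
	long load_avg;			/* sum of the queued entities' contributions */
};

struct toy_se {
	long contrib;			/* this entity's contribution to cfs_rq */
	struct toy_cfs_rq *cfs_rq;	/* queue this entity is enqueued on */
	struct toy_cfs_rq *my_q;	/* queue owned by a group entity, NULL for a task */
	struct toy_se *parent;		/* group entity one level up, NULL at the root */
};

/* 'Disconnected': the change is only visible on the task's own queue. */
static void toy_update_disconnected(struct toy_se *se, long delta)
{
	se->contrib += delta;
	se->cfs_rq->load_avg += delta;
}

/*
 * 'Connected': after the task-level update, walk towards the root. Each
 * group entity refreshes its contribution from the queue it owns and the
 * delta is applied to the parent queue immediately.
 */
static void toy_update_connected(struct toy_se *se, long delta)
{
	toy_update_disconnected(se, delta);

	for (se = se->parent; se; se = se->parent) {
		long old = se->contrib;

		assert(se->my_q != NULL);	/* only group entities have parents below them */
		se->contrib = se->my_q->load_avg;	/* assumption: no shares scaling */
		se->cfs_rq->load_avg += se->contrib - old;
	}
}

/* Attach a task with load 1024 in /root/level2 and report the root load. */
static long toy_root_load_after_attach(int connected)
{
	struct toy_cfs_rq root = { 0 }, level2 = { 0 };
	struct toy_se group = { 0, &root, &level2, NULL };
	struct toy_se task = { 0, &level2, NULL, &group };

	if (connected)
		toy_update_connected(&task, 1024);
	else
		toy_update_disconnected(&task, 1024);

	return root.load_avg;
}
```

In this toy, toy_root_load_after_attach(1) sees the full 1024 on the root
queue right away, while toy_root_load_after_attach(0) still sees 0 there.
In the real new implementation the group se has its own PELT signal that
evolves over time, so the toy exaggerates the effect; it is only meant to
illustrate the missing immediate propagation.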