On 25 May 2016 at 19:40, Dietmar Eggemann <dietmar.eggem...@arm.com> wrote: > Hi Vincent, > > On 24/05/16 10:55, Vincent Guittot wrote: >> Fix the insertion of cfs_rq in rq->leaf_cfs_rq_list to ensure that >> a child will always called before its parent. >> >> The hierarchical order in shares update list has been introduced by >> commit 67e86250f8ea ("sched: Introduce hierarchal order on shares update >> list") >> >> With the current implementation a child can be still put after its parent. >> >> Lets take the example of >> root >> \ >> b >> /\ >> c d* >> | >> e* >> >> with root -> b -> c already enqueued but not d -> e so the leaf_cfs_rq_list >> looks like: head -> c -> b -> root -> tail >> >> The branch d -> e will be added the first time that they are enqueued, >> starting with e then d. >> >> When e is added, its parents is not already on the list so e is put at the >> tail : head -> c -> b -> root -> e -> tail >> >> Then, d is added at the head because its parent is already on the list: >> head -> d -> c -> b -> root -> e -> tail >> >> e is not placed at the right position and will be called the last whereas >> it should be called at the beginning. >> >> Because it follows the bottom-up enqueue sequence, we are sure that we >> will finished to add either a cfs_rq without parent or a cfs_rq with a parent >> that is already on the list. We can use this event to detect when we have >> finished to add a new branch. For the others, whose parents are not already >> added, we have to ensure that they will be added after their children that >> have just been inserted the steps before, and after any potential parents >> that >> are already in the list. The easiest way is to put the cfs_rq just after the >> last inserted one and to keep track of it untl the branch is fully added. >> >> Signed-off-by: Vincent Guittot <vincent.guit...@linaro.org> > > So the use case for this would be you create a multi-level task group > hierarchy on a cpu (e.g. tg_css_id=2,4,6 on cpu=1) and let a task run in > the highest level task group (tg_css_id=6). > > list_add_leaf_cfs_rq() call: > > ... > enqueue_task_fair: cpu=1 tg_css_id=6 cfs_rq=0xffffffc97500f700 on_list=0 > enqueue_task_fair: cpu=1 tg_css_id=4 cfs_rq=0xffffffc975175900 on_list=0 > enqueue_task_fair: cpu=1 tg_css_id=2 cfs_rq=0xffffffc975866d00 on_list=0 > enqueue_task_fair: cpu=1 tg_css_id=1 cfs_rq=0xffffffc97fec44e8 on_list=1 > > ... > > In this case, the for_each_leaf_cfs_rq() in update_blocked_averages() > iterates in the tg_css_id=2,1,6,4 order: > > ... > update_blocked_averages: tg_css_id=2 cfs_rq=0xffffffc975866d00 on_list=1 > update_blocked_averages: tg_css_id=1 cfs_rq=0xffffffc97fec44e8 on_list=1 > update_blocked_averages: tg_css_id=6 cfs_rq=0xffffffc97500f700 on_list=1 > update_blocked_averages: tg_css_id=4 cfs_rq=0xffffffc975175900 on_list=1 > ... > > which I guess breaks the update_tg_cfs_util() call you introduced in > update_blocked_averages() in '[RFC PATCH v2] sched: reflect sched_entity > movement into task_group's utilization'. IMHO, otherwise > update_blocked_averages() can deal with the list not being ordered > tg_css_id=6,4,2,1. > > https://lkml.org/lkml/2016/5/24/200 > > The use of for_each_leaf_cfs_rq() in update_shares() is gone. Do the > remaining call sites (update_runtime_enabled(), > unthrottle_offline_cfs_rqs() require any ordering?
I can see one potential issue with unthrottle_offline_cfs_rqs() which goes to the hierarchy when unthrottling. Otherwise i don't see any other dependency with ordering the cfs_rq than the patch i have sent for task group utilization The other point is this patch align the code with the comment > > [...]