On Mon, Aug 28, 2017 at 10:41:55AM +0200, Peter Zijlstra wrote:
> On Fri, Aug 25, 2017 at 12:11:31AM +0200, Uladzislau Rezki (Sony) wrote:
> > From: Uladzislau Rezki <[email protected]>
> > 
> > As a first step, this patch turns the cfs_tasks list into an MRU
> > one: whenever a task is picked to run on a physical CPU, it is
> > moved to the front of the list.
> > 
> > Therefore, the cfs_tasks list is more or less sorted (woken tasks
> > aside), running from tasks recently given CPU time toward tasks
> > with the maximum wait time in the run-queue, i.e. an MRU list.
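> >
> > In list terms, the invariant is (an illustrative sketch of the
> > hunk below, using the same names):
> >
> > 	/* On pick: the task about to run goes to the head... */
> > 	list_move(&se->group_node, &rq->cfs_tasks);
> > 	/* ...so the longest-waiting tasks drift toward the tail. */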
> > 
> > Second, as part of the load balance operation, this approach
> > starts detach_tasks()/detach_one_task() from the tail of the
> > queue instead of the head, which has two advantages:
> > 
> > - it tends to pick the task with the highest wait time;
> > - tasks located at the tail are less likely to be cache-hot,
> >   so can_migrate_task() is more likely to approve the
> >   migration (sketched below).
> > 
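> > A rough sketch of the tail-first walk (illustrative only; the
> > real detach_tasks() loop differs, the helper name is hypothetical,
> > and env is the load balancer's usual struct lb_env):
> >
> > 	static struct task_struct *pick_tail_candidate(struct lb_env *env)
> > 	{
> > 		struct task_struct *p;
> >
> > 		/* Tail first: longest-waiting, most likely cache-cold. */
> > 		list_for_each_entry_reverse(p, &env->src_rq->cfs_tasks,
> > 					    se.group_node) {
> > 			if (can_migrate_task(p, env))
> > 				return p;	/* migration candidate */
> > 		}
> > 		return NULL;
> > 	}
> >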
> > hackbench shows slightly better performance. For example, with
> > 1000 samples and 40 groups on an i5-3320M CPU, the figures are:
> > 
> > default: 0.644 avg
> > patched: 0.637 avg
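> >
> > (For reproduction: the exact invocation is an assumption, not
> > given here, but presumably something along the lines of
> >
> > 	$ for i in $(seq 1 1000); do hackbench -g 40; done
> >
> > averaging the reported "Time:" values.)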
> > 
> > Signed-off-by: Uladzislau Rezki (Sony) <[email protected]>
> > ---
> >  kernel/sched/fair.c | 19 ++++++++++++++-----
> >  1 file changed, 14 insertions(+), 5 deletions(-)
> > 
> > diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> > index c77e4b1d51c0..cda281c6bb29 100644
> > --- a/kernel/sched/fair.c
> > +++ b/kernel/sched/fair.c
> > @@ -6357,7 +6357,7 @@ pick_next_task_fair(struct rq *rq, struct task_struct *prev, struct rq_flags *rf
> >     if (hrtick_enabled(rq))
> >             hrtick_start_fair(rq, p);
> >  
> > -   return p;
> > +   goto done;
> >  simple:
> >     cfs_rq = &rq->cfs;
> >  #endif
> > @@ -6378,6 +6378,14 @@ pick_next_task_fair(struct rq *rq, struct task_struct *prev, struct rq_flags *rf
> >     if (hrtick_enabled(rq))
> >             hrtick_start_fair(rq, p);
> >  
> > +done: __maybe_unused
> > +   /*
> > +    * Move the next running task to the front of
> > +    * the list, so our cfs_tasks list becomes MRU
> > +    * one.
> > +    */
> > +   list_move(&se->group_node, &rq->cfs_tasks);
> > +
> >     return p;
> >  
> >  idle:
> 
> Could you also run something like:
> 
> $ taskset 1 perf bench sched pipe
> 
> to make sure the added list_move() doesn't hurt; I'm not sure group_node
> and cfs_tasks are in cachelines we already touch for that operation.
> 
> And if you can see that list_move() hurt in "perf annotate", try moving
> those members around to cache lines that we already need anyway.
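> 
> One way to eyeball the current layout (assuming a vmlinux built
> with debug info; pahole comes with the dwarves package):
> 
> 	$ pahole -C rq vmlinux | grep -B2 -A2 cfs_tasks
> 	$ pahole -C sched_entity vmlinux | grep -B2 -A2 group_node
> 
> which shows the byte offset and cacheline boundary for each member.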
@Peter: just in case you missed my email: I uploaded one more patch
where I provided the latest results as well. Please have a look at
the following links:

https://lkml.org/lkml/2017/9/13/167 
https://lkml.org/lkml/2017/9/13/168

Best Regards,
Uladzislau Rezki
