On Mon, May 23, 2016 at 01:12:07PM +0200, Mike Galbraith wrote:
> On Mon, 2016-05-23 at 11:58 +0100, Morten Rasmussen wrote:
> > wake_wide() is based on task wakee_flips of the waker and the wakee to
> > decide whether an affine wakeup is desirable. On lightly loaded systems
> > the waker is frequently the idle task (pid=0) which can accumulate a lot
> > of wakee_flips in that scenario. It makes little sense to prevent affine
> > wakeups on an idle cpu due to the idle task wakee_flips, so it makes
> > more sense to ignore them in wake_wide().
>
> You sure? What's the difference between a task flipping enough to
> warrant spreading the load, and an interrupt source doing the same?
> I've both witnessed firsthand, and received user confirmation of this
> very thing improving utilization.

Right, I didn't consider the interrupt source scenario, my fault.

The problem then seems to be distinguishing a truly idle cpu from one that
is busy handling interrupts. The issue that I observe is that wake_wide()
likes pushing tasks around in lightly loaded scenarios, which isn't
desirable for power management. Selecting the same cpu again may
potentially let the other cpus reach a deeper C-state.

With that in mind I will see if I can do better. Suggestions are welcome :-)

> > cc: Ingo Molnar <mi...@redhat.com>
> > cc: Peter Zijlstra <pet...@infradead.org>
> >
> > Signed-off-by: Morten Rasmussen <morten.rasmus...@arm.com>
> > ---
> >  kernel/sched/fair.c | 4 ++++
> >  1 file changed, 4 insertions(+)
> >
> > diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> > index c49e25a..0fe3020 100644
> > --- a/kernel/sched/fair.c
> > +++ b/kernel/sched/fair.c
> > @@ -5007,6 +5007,10 @@ static int wake_wide(struct task_struct *p)
> >  	unsigned int slave = p->wakee_flips;
> >  	int factor = this_cpu_read(sd_llc_size);
> >
> > +	/* Don't let the idle task prevent affine wakeups */
> > +	if (is_idle_task(current))
> > +		return 0;
> > +
> >  	if (master < slave)
> >  		swap(master, slave);
> >  	if (slave < factor || master < slave * factor)
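
To make the disagreement concrete, here is a rough user-space model of the
check, not the kernel code itself. struct task_model, wake_wide_model() and
the ignore_idle_waker switch are made-up names for illustration; llc_size
stands in for sd_llc_size, and the final "return 1" (wake wide) is my reading
of the part of the function that the hunk above cuts off.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the task_struct fields the heuristic reads. */
struct task_model {
	unsigned int wakee_flips;	/* how often the task switches wakee */
	bool is_idle;			/* true for the per-cpu idle task */
};

/*
 * Model of wake_wide(): returns 1 to spread the wakeup ("wide"),
 * 0 to allow an affine wakeup. ignore_idle_waker models the proposed
 * early return for an idle waker; it is an illustrative switch only.
 */
int wake_wide_model(const struct task_model *waker,
		    const struct task_model *wakee,
		    unsigned int llc_size, bool ignore_idle_waker)
{
	unsigned int master = waker->wakee_flips;
	unsigned int slave = wakee->wakee_flips;

	/* Proposed change: the idle task never blocks an affine wakeup. */
	if (ignore_idle_waker && waker->is_idle)
		return 0;

	/* Order the pair so that master holds the larger flip count. */
	if (master < slave) {
		unsigned int tmp = master;

		master = slave;
		slave = tmp;
	}

	/* Too few flips relative to the LLC size: keep the wakeup affine. */
	if (slave < llc_size || master < slave * llc_size)
		return 0;

	return 1;
}
```

With llc_size = 4, a waker at 100 flips and a wakee at 8 flips goes wide
without the early return. With it, the same numbers stay affine whenever the
waker is the idle task, which also covers wakeups issued from interrupts that
land on an otherwise idle cpu. That is exactly the case Mike is asking about.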