On Wed, Jan 31, 2018 at 9:50 AM, Rohit Jain <rohit.k.j...@oracle.com> wrote:
>>>>>  kernel/sched/fair.c | 38 ++++++++++++++++++++++++++++----------
>>>>>  1 file changed, 28 insertions(+), 10 deletions(-)
>>>>>
>>>>> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
>>>>> index 26a71eb..ce5ccf8 100644
>>>>> --- a/kernel/sched/fair.c
>>>>> +++ b/kernel/sched/fair.c
>>>>> @@ -5625,6 +5625,11 @@ static unsigned long capacity_orig_of(int cpu)
>>>>>         return cpu_rq(cpu)->cpu_capacity_orig;
>>>>>  }
>>>>>
>>>>> +static inline bool full_capacity(int cpu)
>>>>> +{
>>>>> +       return capacity_of(cpu) >= (capacity_orig_of(cpu)*3)/4;
>>>>> +}
>>>>> +
>>>>>  static unsigned long cpu_avg_load_per_task(int cpu)
>>>>>  {
>>>>>         struct rq *rq = cpu_rq(cpu);
>>>>> @@ -6081,7 +6086,7 @@ static int select_idle_core(struct task_struct
>>>>> *p,
>>>>> struct sched_domain *sd, int
>>>>>
>>>>>                 for_each_cpu(cpu, cpu_smt_mask(core)) {
>>>>>                         cpumask_clear_cpu(cpu, cpus);
>>>>> -                       if (!idle_cpu(cpu))
>>>>> +                       if (!idle_cpu(cpu) || !full_capacity(cpu))
>>>>>                                 idle = false;
>>>>>                 }
>>>>
>>>> There's some difference in logic between select_idle_core and
>>>> select_idle_cpu as far as the full_capacity stuff you're adding goes.
>>>> In select_idle_core, if all CPUs are !full_capacity, you're returning
>>>> -1. But in select_idle_cpu you're returning the best idle CPU that's
>>>> the most cap among the !full_capacity ones. Why there is this
>>>> different in logic? Did I miss something?
>>>>
>
> <snip>
>
>> Dude :) That is hardly an answer to the question I asked. Hint:
>> *different in logic*.
>
>
> Let me re-try :)
>
> For select_idle_core, we are doing a search for a fully idle and full
> capacity core, the fail-safe is select_idle_cpu because we will re-scan
> the CPUs. The notion is to select an idle CPU no matter what, because
> being on an idle CPU is better than waiting on a non-idle one. In
> select_idle_core you can be slightly picky about the core because
> select_idle_cpu is a fail safe. I measured the performance impact of
> choosing the "best among low cap" vs the code changes I have (for
> select_idle_core) and could not find a statistically significant impact,
> hence went with the simpler code changes.
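To make sure I read the two-stage search described above correctly, here is a rough userspace sketch of it. This is not the kernel code: struct model_cpu and all the model_* helpers are hypothetical names made up for illustration, and I'm assuming the lenient pass returns an idle full-capacity CPU as soon as it finds one. Only the 3/4 threshold is taken from the quoted full_capacity().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace model of a CPU; not the kernel's struct rq. */
struct model_cpu {
	bool idle;
	unsigned long capacity;      /* capacity currently available */
	unsigned long capacity_orig; /* capacity with no pressure */
};

/* Same 3/4 threshold as full_capacity() in the quoted patch. */
static bool model_full_capacity(const struct model_cpu *c)
{
	return c->capacity >= (c->capacity_orig * 3) / 4;
}

/*
 * Strict pass: a core qualifies only if every SMT sibling is idle and at
 * full capacity. Returns the first CPU of such a core, or -1 if none.
 */
static int model_select_idle_core(const struct model_cpu *cpus,
				  size_t ncpus, size_t smt)
{
	assert(smt > 0 && ncpus % smt == 0);

	for (size_t core = 0; core < ncpus; core += smt) {
		bool idle = true;

		for (size_t cpu = core; cpu < core + smt; cpu++) {
			if (!cpus[cpu].idle || !model_full_capacity(&cpus[cpu]))
				idle = false;
		}
		if (idle)
			return (int)core;
	}
	return -1;
}

/*
 * Lenient pass (assumed behaviour): take an idle full-capacity CPU if one
 * exists, otherwise the idle CPU with the most capacity. -1 if none is idle.
 */
static int model_select_idle_cpu(const struct model_cpu *cpus, size_t ncpus)
{
	int best = -1;
	unsigned long best_cap = 0;

	for (size_t cpu = 0; cpu < ncpus; cpu++) {
		if (!cpus[cpu].idle)
			continue;
		if (model_full_capacity(&cpus[cpu]))
			return (int)cpu;
		if (best < 0 || cpus[cpu].capacity > best_cap) {
			best = (int)cpu;
			best_cap = cpus[cpu].capacity;
		}
	}
	return best;
}

/* Strict core search first, lenient CPU search as the fail-safe. */
static int model_select_idle_sibling(const struct model_cpu *cpus,
				     size_t ncpus, size_t smt)
{
	int cpu = model_select_idle_core(cpus, ncpus, smt);

	return cpu >= 0 ? cpu : model_select_idle_cpu(cpus, ncpus);
}
```

In this model the asymmetry is visible directly: the core pass gives up with -1 when every core has a busy or low-capacity sibling, and only the CPU pass falls back to "best among low cap".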
That's OK with me. I just remember Peter messing with this path, and that
scanning too much was expensive on some systems. The other thing is that
you're really going to do a "fail safe" search, as you call it, here with
SIS_PROP set. Do you see a difference in perf when taking the same approach
as you took in select_idle_core?

Peter, are you OK with the approach Rohit has adopted to pick the best
capacity idle CPU in select_idle_cpu? I guess nr--; will bail out early if
we have SIS_PROP set, in case the scan cost gets too high, but then again we
might end up scanning too few CPUs.

thanks,

- Joel
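P.S. To illustrate the "scanning too few CPUs" worry, here is a small userspace sketch of a budgeted scan. It is a simplified model, not the kernel's select_idle_cpu: struct scan_cpu and scan_best_idle() are hypothetical names, the budget nr is passed in directly rather than derived from SIS_PROP, and returning the best CPU seen so far when the budget runs out is my assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a CPU for this sketch only. */
struct scan_cpu {
	bool idle;
	unsigned long capacity;
};

/*
 * Visit at most nr CPUs, starting at target and wrapping around.
 * Returns the idle CPU with the most capacity among those visited,
 * or -1 if no visited CPU was idle.
 */
static int scan_best_idle(const struct scan_cpu *cpus, size_t ncpus,
			  size_t target, unsigned int nr)
{
	int best = -1;
	unsigned long best_cap = 0;

	assert(ncpus > 0 && target < ncpus);

	for (size_t n = 0; n < ncpus; n++) {
		size_t cpu = (target + n) % ncpus;

		/* Scan budget exhausted: stop early. */
		if (nr == 0)
			break;
		nr--;

		if (!cpus[cpu].idle)
			continue;
		if (best < 0 || cpus[cpu].capacity > best_cap) {
			best = (int)cpu;
			best_cap = cpus[cpu].capacity;
		}
	}
	return best;
}
```

With a small nr the scan can stop before it reaches the highest-capacity idle CPU, so a "best capacity" pick is only as good as the budget allows.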