* Vincent Guittot <vincent.guit...@linaro.org> wrote:

> The utilization of the CPU by rt, dl and interrupts are now tracked with
> PELT so we can use these metrics instead of rt_avg to evaluate the remaining
> capacity available for cfs class.
>
> scale_rt_capacity() behavior has been changed and now returns the remaining
> capacity available for cfs instead of a scaling factor because rt, dl and
> interrupt provide now absolute utilization value.
>
> The same formula as schedutil is used:
>   irq util_avg + (1 - irq util_avg / max capacity ) * /Sum rq util_avg
> but the implementation is different because it doesn't return the same value
> and doesn't benefit of the same optimization
>
> Cc: Ingo Molnar <mi...@redhat.com>
> Cc: Peter Zijlstra <pet...@infradead.org>
> Signed-off-by: Vincent Guittot <vincent.guit...@linaro.org>
> ---
>  kernel/sched/deadline.c |  2 --
>  kernel/sched/fair.c     | 41 +++++++++++++++++++----------------------
>  kernel/sched/pelt.c     |  2 +-
>  kernel/sched/rt.c       |  2 --
>  4 files changed, 20 insertions(+), 27 deletions(-)
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index d2758e3..ce0dcbf 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -7550,39 +7550,36 @@ static inline int get_sd_load_idx(struct sched_domain
> *sd,
>  static unsigned long scale_rt_capacity(int cpu)
>  {
>  	struct rq *rq = cpu_rq(cpu);
> -	u64 total, used, age_stamp, avg;
> -	s64 delta;
> -
> -	/*
> -	 * Since we're reading these variables without serialization make sure
> -	 * we read them once before doing sanity checks on them.
> -	 */
> -	age_stamp = READ_ONCE(rq->age_stamp);
> -	avg = READ_ONCE(rq->rt_avg);
> -	delta = __rq_clock_broken(rq) - age_stamp;
> +	unsigned long max = arch_scale_cpu_capacity(NULL, cpu);
> +	unsigned long used, irq, free;
>
> -	if (unlikely(delta < 0))
> -		delta = 0;
> +#if defined(CONFIG_IRQ_TIME_ACCOUNTING) ||
> defined(CONFIG_PARAVIRT_TIME_ACCOUNTING)
> +	irq = READ_ONCE(rq->avg_irq.util_avg);
>
> -	total = sched_avg_period() + delta;
> +	if (unlikely(irq >= max))
> +		return 1;
> +#endif

Note that 'irq' is unused outside that macro block, resulting in a new warning
on defconfig builds:

  CC      kernel/sched/fair.o
 kernel/sched/fair.c: In function ‘scale_rt_capacity’:
 kernel/sched/fair.c:7553:22: warning: unused variable ‘irq’ [-Wunused-variable]
   unsigned long used, irq, free;
                       ^~~

I have applied the delta fix below for simplicity, but what we really want is a
cleanup of that function to eliminate the #ifdefs.

One solution would be to factor out the 'irq' utilization value into a helper
inline, and double check that if the configs are off the compiler does the right
thing and eliminates this identity transformation for the irq==0 case:

	free *= (max - irq);
	free /= max;

If the compiler refuses to optimize this away (due to the zero and overflow
cases), try to find something more clever?

Thanks,

	Ingo

 kernel/sched/fair.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index e3221db0511a..d5f7d521e448 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -7550,7 +7550,10 @@ static unsigned long scale_rt_capacity(int cpu)
 {
 	struct rq *rq = cpu_rq(cpu);
 	unsigned long max = arch_scale_cpu_capacity(NULL, cpu);
-	unsigned long used, irq, free;
+	unsigned long used, free;
+#if defined(CONFIG_IRQ_TIME_ACCOUNTING) || defined(CONFIG_PARAVIRT_TIME_ACCOUNTING)
+	unsigned long irq;
+#endif
 
 #if defined(CONFIG_IRQ_TIME_ACCOUNTING) || defined(CONFIG_PARAVIRT_TIME_ACCOUNTING)
 	irq = READ_ONCE(rq->avg_irq.util_avg);
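
For illustration only, the helper-inline idea suggested above could look
roughly like the user-space sketch below. This is not the kernel code: the
names HAVE_IRQ_TIME_SKETCH, irq_util_sketch() and remaining_capacity_sketch()
are hypothetical, the macro merely stands in for the two CONFIG_* options, and
the "used >= max" clamp is an assumption about how the rest of the function
behaves. With the macro at 0 the helper returns a constant 0, so the
multiply/divide pair becomes an identity the compiler may be able to drop.

```c
#include <assert.h>

/* Hypothetical stand-in for CONFIG_IRQ_TIME_ACCOUNTING ||
 * CONFIG_PARAVIRT_TIME_ACCOUNTING; build with -DHAVE_IRQ_TIME_SKETCH=1
 * to model IRQ time accounting being enabled. */
#ifndef HAVE_IRQ_TIME_SKETCH
#define HAVE_IRQ_TIME_SKETCH 0
#endif

/* Hypothetical helper: the only place that knows about the config option.
 * Returns the tracked IRQ utilization, or a constant 0 when accounting
 * is compiled out. */
static inline unsigned long irq_util_sketch(unsigned long avg_irq_util)
{
#if HAVE_IRQ_TIME_SKETCH
	return avg_irq_util;
#else
	(void)avg_irq_util;	/* unused when accounting is off */
	return 0;
#endif
}

/* Remaining capacity for cfs: what rt/dl left free, scaled by the
 * fraction of time not spent in interrupts. No #ifdef needed here. */
static inline unsigned long remaining_capacity_sketch(unsigned long max,
						      unsigned long used,
						      unsigned long avg_irq_util)
{
	unsigned long irq = irq_util_sketch(avg_irq_util);
	unsigned long free;

	assert(max > 0);	/* guards the division below */

	if (irq >= max)
		return 1;
	if (used >= max)	/* assumption: clamp like the irq case */
		return 1;

	free = max - used;
	free *= (max - irq);	/* identity when irq == 0 */
	free /= max;

	return free;
}
```

With the default build (macro at 0), remaining_capacity_sketch(1024, 256, 512)
ignores the IRQ value and yields 768; built with -DHAVE_IRQ_TIME_SKETCH=1 the
same call yields 384. Whether the compiler really removes the multiply/divide
in the irq==0 case would still need to be checked in the generated code.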