* Vincent Guittot <vincent.guit...@linaro.org> wrote:

> The utilization of the CPU by rt, dl and interrupts is now tracked with
> PELT, so we can use these metrics instead of rt_avg to evaluate the remaining
> capacity available for the cfs class.
> 
> scale_rt_capacity() behavior has been changed and now returns the remaining
> capacity available for cfs instead of a scaling factor, because rt, dl and
> interrupts now provide absolute utilization values.
> 
> The same formula as schedutil is used:
>   irq util_avg + (1 - irq util_avg / max capacity) * \Sum rq util_avg
> but the implementation is different because it doesn't return the same value
> and doesn't benefit from the same optimizations.
> 
> Cc: Ingo Molnar <mi...@redhat.com>
> Cc: Peter Zijlstra <pet...@infradead.org>
> Signed-off-by: Vincent Guittot <vincent.guit...@linaro.org>
> ---
>  kernel/sched/deadline.c |  2 --
>  kernel/sched/fair.c     | 41 +++++++++++++++++++----------------------
>  kernel/sched/pelt.c     |  2 +-
>  kernel/sched/rt.c       |  2 --
>  4 files changed, 20 insertions(+), 27 deletions(-)

> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index d2758e3..ce0dcbf 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -7550,39 +7550,36 @@ static inline int get_sd_load_idx(struct sched_domain *sd,
>  static unsigned long scale_rt_capacity(int cpu)
>  {
>       struct rq *rq = cpu_rq(cpu);
> -     u64 total, used, age_stamp, avg;
> -     s64 delta;
> -
> -     /*
> -      * Since we're reading these variables without serialization make sure
> -      * we read them once before doing sanity checks on them.
> -      */
> -     age_stamp = READ_ONCE(rq->age_stamp);
> -     avg = READ_ONCE(rq->rt_avg);
> -     delta = __rq_clock_broken(rq) - age_stamp;
> +     unsigned long max = arch_scale_cpu_capacity(NULL, cpu);
> +     unsigned long used, irq, free;
>  
> -     if (unlikely(delta < 0))
> -             delta = 0;
> +#if defined(CONFIG_IRQ_TIME_ACCOUNTING) || defined(CONFIG_PARAVIRT_TIME_ACCOUNTING)
> +     irq = READ_ONCE(rq->avg_irq.util_avg);
>  
> -     total = sched_avg_period() + delta;
> +     if (unlikely(irq >= max))
> +             return 1;
> +#endif

Note that 'irq' is unused outside that macro block, resulting in a new warning
on defconfig builds:

 CC      kernel/sched/fair.o
 kernel/sched/fair.c: In function ‘scale_rt_capacity’:
 kernel/sched/fair.c:7553:22: warning: unused variable ‘irq’ [-Wunused-variable]
   unsigned long used, irq, free;
                       ^~~

I have applied the delta fix below for simplicity, but what we really want is a
cleanup of that function to eliminate the #ifdefs. One solution would be to
factor out the 'irq' utilization value into a helper inline, and double-check
that if the configs are off the compiler does the right thing and eliminates
this identity transformation for the irq==0 case:

        free *= (max - irq);
        free /= max;

If the compiler refuses to optimize this away (due to the zero and overflow 
cases), try to find something more clever?
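
A minimal user-space sketch of the helper-inline idea, for illustration only.
Every name in it (SKETCH_IRQ_TIME_ACCOUNTING, struct rq_sketch,
irq_util_sketch(), scale_free_sketch()) is hypothetical and not taken from the
kernel tree, and the 'used >= max' guard is an assumption about the part of the
function not quoted above. The point is only that the config check lives in one
helper, so the caller needs no #ifdef and 'irq' is always used:

```c
/*
 * Hypothetical stand-in for the kernel config options. Build with
 * -DSKETCH_IRQ_TIME_ACCOUNTING=0 to model a config with accounting off.
 */
#ifndef SKETCH_IRQ_TIME_ACCOUNTING
#define SKETCH_IRQ_TIME_ACCOUNTING 1
#endif

/* Hypothetical, heavily simplified stand-in for struct rq. */
struct rq_sketch {
	unsigned long avg_irq_util;	/* models rq->avg_irq.util_avg */
};

/* Hypothetical helper: the only place that looks at the config switch. */
static inline unsigned long irq_util_sketch(const struct rq_sketch *rq)
{
#if SKETCH_IRQ_TIME_ACCOUNTING
	return rq->avg_irq_util;
#else
	(void)rq;	/* accounting off: report no interrupt utilization */
	return 0;
#endif
}

/* Scale the capacity left after 'used' by the share of time not in irq. */
static unsigned long scale_free_sketch(const struct rq_sketch *rq,
				       unsigned long max, unsigned long used)
{
	unsigned long irq = irq_util_sketch(rq);
	unsigned long free;

	if (irq >= max)
		return 1;
	if (used >= max)	/* assumed guard, not shown in the quoted hunk */
		return 1;

	free = max - used;
	free *= (max - irq);	/* with irq == 0 this is free * max / max */
	free /= max;

	return free;
}
```

With max = 1024, used = 256 and irq = 512 this returns 768 * 512 / 1024 = 384;
with irq = 0 it returns 768 unchanged. Whether the compiler actually folds the
multiply/divide pair away when the helper returns a constant 0 is exactly the
open question above and would need checking in the generated code.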

Thanks,

        Ingo

 kernel/sched/fair.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index e3221db0511a..d5f7d521e448 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -7550,7 +7550,10 @@ static unsigned long scale_rt_capacity(int cpu)
 {
        struct rq *rq = cpu_rq(cpu);
        unsigned long max = arch_scale_cpu_capacity(NULL, cpu);
-       unsigned long used, irq, free;
+       unsigned long used, free;
+#if defined(CONFIG_IRQ_TIME_ACCOUNTING) || defined(CONFIG_PARAVIRT_TIME_ACCOUNTING)
+       unsigned long irq;
+#endif
 
 #if defined(CONFIG_IRQ_TIME_ACCOUNTING) || defined(CONFIG_PARAVIRT_TIME_ACCOUNTING)
        irq = READ_ONCE(rq->avg_irq.util_avg);
