From: Dietmar Eggemann <dietmar.eggem...@arm.com>

Since now we have besides frequency invariant also cpu (uarch plus max
system frequency) invariant cfs_rq::utilization_load_avg both, frequency
and cpu scaling happens as part of the load tracking.
So cfs_rq::utilization_load_avg does not have to be scaled by the original
capacity of the cpu again.

Cc: Ingo Molnar <mi...@redhat.com>
Cc: Peter Zijlstra <pet...@infradead.org>

Signed-off-by: Dietmar Eggemann <dietmar.eggem...@arm.com>
---
 kernel/sched/fair.c | 34 +++++++++++++++++++++-------------
 1 file changed, 21 insertions(+), 13 deletions(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 39871a4..0c08cff 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -5013,32 +5013,40 @@ static int select_idle_sibling(struct task_struct *p, 
int target)
 done:
        return target;
 }
+
 /*
  * get_cpu_usage returns the amount of capacity of a CPU that is used by CFS
  * tasks. The unit of the return value must be the one of capacity so we can
  * compare the usage with the capacity of the CPU that is available for CFS
  * task (ie cpu_capacity).
+ *
  * cfs.utilization_load_avg is the sum of running time of runnable tasks on a
  * CPU. It represents the amount of utilization of a CPU in the range
- * [0..SCHED_LOAD_SCALE].  The usage of a CPU can't be higher than the full
- * capacity of the CPU because it's about the running time on this CPU.
- * Nevertheless, cfs.utilization_load_avg can be higher than SCHED_LOAD_SCALE
- * because of unfortunate rounding in avg_period and running_load_avg or just
- * after migrating tasks until the average stabilizes with the new running
- * time. So we need to check that the usage stays into the range
- * [0..cpu_capacity_orig] and cap if necessary.
- * Without capping the usage, a group could be seen as overloaded (CPU0 usage
- * at 121% + CPU1 usage at 80%) whereas CPU1 has 20% of available capacity
+ * [0..capacity_orig] where capacity_orig is the cpu_capacity available at the
+ * highest frequency (arch_scale_freq_capacity()). The usage of a CPU converges
+ * towards a sum equal to or less than the current capacity (capacity_curr <=
+ * capacity_orig) of the CPU because it is the running time on this CPU scaled
+ * by capacity_curr. Nevertheless, cfs.utilization_load_avg can be higher than
+ * capacity_curr or even higher than capacity_orig because of unfortunate
+ * rounding in avg_period and running_load_avg or just after migrating tasks
+ * (and new task wakeups) until the average stabilizes with the new running
+ * time. We need to check that the usage stays into the range
+ * [0..capacity_orig] and cap if necessary. Without capping the usage, a group
+ * could be seen as overloaded (CPU0 usage at 121% + CPU1 usage at 80%) whereas
+ * CPU1 has 20% of available capacity. We allow usage to overshoot
+ * capacity_curr (but not capacity_orig) as it useful for predicting the
+ * capacity required after task migrations (scheduler-driven DVFS).
  */
+
 static int get_cpu_usage(int cpu)
 {
        unsigned long usage = cpu_rq(cpu)->cfs.utilization_load_avg;
-       unsigned long capacity = capacity_orig_of(cpu);
+       unsigned long capacity_orig = capacity_orig_of(cpu);
 
-       if (usage >= SCHED_LOAD_SCALE)
-               return capacity;
+       if (usage >= capacity_orig)
+               return capacity_orig;
 
-       return (usage * capacity) >> SCHED_LOAD_SHIFT;
+       return usage;
 }
 
 /*
-- 
1.9.1

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Reply via email to