On Wednesday 07 Nov 2018 at 11:47:09 (+0100), Dietmar Eggemann wrote:
> The important bit for EAS is that it only uses utilization in the
> non-overutilized case. Here, utilization signals should look the same
> between the two approaches, not considering tasks with long periods like the
> 39/80ms example above.
>
> There are also some advantages for EAS with time scaling: (1) faster
> overutilization detection when a big task runs on a little CPU, (2) higher
> (initial) task utilization value when this task migrates from little to big
> CPU.
Agreed, these patches should help detect the over-utilized scenarios
faster and more reliably, which is probably a good thing. I'll try to have
a look in more detail soon.

> We should run our EAS task placement tests with your time scaling patches.

Right, I tried these patches with the synthetic tests we usually run
against our upstream EAS dev branch (see [1]), and I couldn't see any
regression, which is a good sign :-)

<slightly off topic>
Since most people are probably not familiar with these tests, I'll try to
elaborate a little bit more. They are unit tests aimed at stressing
particular behaviours of the scheduler on asymmetric platforms. More
precisely, they check that capacity-awareness/misfit and EAS are actually
able to up-migrate and down-migrate tasks between big and little CPUs when
necessary.

The tests are based on rt-app and ftrace. They basically run a whole lot
of scenarios with rt-app (small tasks, big tasks, a mix of both, tasks
changing behaviour, ramping up, ramping down, ...), pull a trace of the
execution, and check that:

  1. the task(s) did not miss activations (which will basically be true
     only if the scheduler managed to provide each task with enough CPU
     capacity). We call that one 'test_slack';

  2. the task placement is close enough to the optimal placement
     energy-wise (which is computed off-line using the energy model and
     the rt-app conf). We call that one 'test_task_placement'.

For example, in order to pass the test, a periodic task that ramps up from
10% to 70% over (say) 5s should probably start its execution on little
CPUs to not waste energy, and get up-migrated to big CPUs later on to not
miss activations. Otherwise, one of the two checks will fail. (A rough
rt-app config sketch for such a ramping task is appended after the
references, for illustration.)

I'd like to emphasize that these test scenarios are *not* supposed to look
like real workloads at all. They've been designed with the sole purpose of
stressing specific code paths of the scheduler to spot any obvious
breakage. They've proven quite useful for us in the past. All the tests
are publicly available in the LISA repo [2].
</slightly off topic>

So, to come back to Vincent's patches, I managed to get a 10/10 pass rate
on most of the tests referred to as 'generic' in [1] on my Juno r0. The
kernel I tested had Morten's misfit patches, the EAS patches v8, and
Vincent's patches on top.

Although I still need to really get my head around all the implications of
changing PELT like that, I cannot see any obvious red flags from the
testing perspective here.

Thanks,
Quentin

---
[1] https://developer.arm.com/open-source/energy-aware-scheduling/eas-mainline-development
[2] https://github.com/ARM-software/lisa
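
For illustration, here is a minimal, hand-written rt-app config roughly
matching the ramping task described above. This is not the actual profile
the tests generate (LISA builds these programmatically); the task and
phase names, the 16ms period, the "CPU0" calibration and the coarse
4-step ramp are all made up for the example. In each phase, every
iteration runs 'run' microseconds of busy time and then waits for the
next 'period' boundary, so stepping 'run' up across phases approximates
a 10% -> 70% ramp over roughly 5 seconds:

  {
      "global": {
          "duration": 6,
          "default_policy": "SCHED_OTHER",
          "calibration": "CPU0"
      },
      "tasks": {
          "ramp_task": {
              "loop": 1,
              "phases": {
                  "step_10pct": {
                      "loop": 78,
                      "run": 1600,
                      "timer": { "ref": "ramp_tick", "period": 16000 }
                  },
                  "step_30pct": {
                      "loop": 78,
                      "run": 4800,
                      "timer": { "ref": "ramp_tick", "period": 16000 }
                  },
                  "step_50pct": {
                      "loop": 78,
                      "run": 8000,
                      "timer": { "ref": "ramp_tick", "period": 16000 }
                  },
                  "step_70pct": {
                      "loop": 78,
                      "run": 11200,
                      "timer": { "ref": "ramp_tick", "period": 16000 }
                  }
              }
          }
      }
  }

On a profile like this one, test_slack would check from the trace that no
activation overran its 16ms period, and test_task_placement would compare
the observed placement against the energy-optimal one computed off-line
from the energy model and the conf.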