On 26/09/18 11:33, Vincent Guittot wrote:
> On Wed, 26 Sep 2018 at 11:35, Valentin Schneider
> [...]
>>> Can you give us details about the use case that you care about ?
>>>
>>
>> It's the same as I presented last week - devlib (some python target
>> communication
>
> ok. you mean at linaro connect
>

Yeah, sorry.

>> library I use) has some phase where it spawns a lot of tasks at once to do
>> some setup (busybox, shutils, bash...). Some of those tasks are pinned to a
>> particular CPU, and that can lead to failed load_balance() - and to make things
>> worse, there's a lot of idle_balance() in there.
>>
>> Eventually when I start running my actual workload a few ~100ms later, it's
>> impacted by that balance_interval increase.
>>
>> Admittedly that's a specific use-case, but I don't think this quick increase
>> is something that was intended.
>
> Yes, this really sounds like a specific use-case. Unluckily you find a
> way to reach max interval quite easily/every time with your test
> set-up but keep in mind that this can also happen in real system life
> and without using the newly idle path.
> So if it's a problem to have an interval at max value for your unitary
> test, it probably means that it's a problem for the system and the max
> value is too high
>

Limiting the max value for those tests is actually a good point, and I think
I'll give it a shot. However...

> Taking advantage of all load_balance event to update the interval
> makes sense to me. It seems that you care about a short and regular
> balance interval more than minimizing overhead of load balancing.
> At the opposite, i'm sure that you don't complain if newly idle load
> balance resets the interval to min value and overwrite what the
> periodic load balance set up previously :-)
>

...My concern is more about increasing balance_interval faster than we should.
The proposed "fix" is to prevent any balance_interval increase when going
through idle_balance(), but Patrick (added in cc) suggested offline that we
could simply limit the rate at which we do these increases, so that they match
what we do in rebalance_domains(). We'd still increase balance_interval on
failed newidle load_balance(), but we wouldn't increase from min to max in
e.g. 50ms.

Would that work better for you?

>> [...]
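
To make the rate-limiting idea concrete, here is a rough userspace sketch, not the actual patch. It assumes the interval is doubled on each failed balance attempt and that an increase is allowed at most once per current balance_interval, which is how I understand the periodic path to be paced. The struct, the last_increase field, and all function names are hypothetical and only loosely mirror the scheduler-domain fields discussed above.

```c
#include <assert.h>

/* Hypothetical stand-in for the sched_domain fields under discussion. */
struct toy_domain {
	unsigned long min_interval;     /* ms */
	unsigned long max_interval;     /* ms */
	unsigned long balance_interval; /* ms, current value */
	unsigned long last_increase;    /* ms, time of the last increase (assumed field) */
};

/* Unthrottled: every failed attempt doubles the interval, up to max. */
static void failed_balance_unthrottled(struct toy_domain *sd)
{
	if (sd->balance_interval < sd->max_interval)
		sd->balance_interval *= 2;
}

/*
 * Throttled: a failed attempt may only double the interval if at least
 * one full balance_interval has elapsed since the previous increase.
 */
static void failed_balance_throttled(struct toy_domain *sd, unsigned long now)
{
	if (now - sd->last_increase < sd->balance_interval)
		return;

	if (sd->balance_interval < sd->max_interval) {
		sd->balance_interval *= 2;
		sd->last_increase = now;
	}
}

/*
 * Simulate one failed newidle balance every 5 ms for 50 ms, starting
 * from min_interval = 8 ms with max_interval = 256 ms (made-up values).
 * Returns the resulting balance_interval in ms.
 */
static unsigned long simulate(int throttled)
{
	struct toy_domain sd = { 8, 256, 8, 0 };
	unsigned long now;

	for (now = 5; now <= 50; now += 5) {
		if (throttled)
			failed_balance_throttled(&sd, now);
		else
			failed_balance_unthrottled(&sd);
	}

	assert(sd.balance_interval <= sd.max_interval);
	return sd.balance_interval;
}
```

In this toy model, simulate(0) reaches the 256 ms maximum within the 50 ms window, while simulate(1) only gets to 32 ms. The numbers are illustrative and say nothing about real intervals on a given topology.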