On Friday 07 Sep 2018 at 13:37:49 (+0200), Peter Zijlstra wrote:
> On Fri, Sep 07, 2018 at 09:51:04AM +0200, Vincent Guittot wrote:
> > It can happen that load_balance finds a busiest group and then a busiest rq
> > but the calculated imbalance is in fact null.
>
> Cute. Does that happen often?
I have a use case with RT tasks that reproduces the problem regularly. It
happens at least when we have CPUs with different capacity, either because
of heterogeneous CPUs or because of RT/DL reducing the available capacity
for CFS. I have put the call path that triggers the problem below, and
according to the comment it seems that we can reach a similar state when
playing with priority.

> > If the calculated imbalance is null, it's useless to try to find a busiest
> > rq as no task will be migrated and we can return immediately.
> >
> > This situation can happen with heterogeneous system or smp system when RT
> > tasks are decreasing the capacity of some CPUs.
>
> Is it the result of one of those "force_balance" conditions in
> find_busiest_group() ? Should we not fix that to then return NULL
> instead?

The use case is:

We have a newly_idle load balance that is triggered when an RT task becomes
idle (but I think that I have seen it with the idle load balance too).

We hit:

	if (env->idle != CPU_NOT_IDLE && group_has_capacity(env, local) &&
	    busiest->group_no_capacity)
		goto force_balance;

In calculate_imbalance() we take the path:

	/*
	 * Avg load of busiest sg can be less and avg load of local sg can
	 * be greater than avg load across all sgs of sd because avg load
	 * factors in sg capacity and sgs with smaller group_type are
	 * skipped when updating the busiest sg:
	 */
	if (busiest->avg_load <= sds->avg_load ||
	    local->avg_load >= sds->avg_load) {
		env->imbalance = 0;
		return fix_small_imbalance(env, sds);
	}

but fix_small_imbalance() finally decides to return without modifying the
imbalance, like here:

	if (busiest->avg_load + scaled_busy_load_per_task >=
	    local->avg_load + (scaled_busy_load_per_task * imbn)) {
		env->imbalance = busiest->load_per_task;
		return;
	}

Besides this patch, I'm preparing another patch for fix_small_imbalance() to
ensure 1 task per CPU in a similar situation, but according to the comment
above, we can reach this situation because of task priorities.
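
To make the idea concrete, one possible shape of the early return along the
lines Peter suggests, assuming it sits at the end of find_busiest_group()
right after calculate_imbalance() (just a sketch, not the actual patch):

	/* Looks like there is an imbalance. Compute it. */
	calculate_imbalance(env, &sds);

	/*
	 * If the computed imbalance ends up null, no task will be migrated,
	 * so don't report a busiest group and skip the useless search for a
	 * busiest rq.
	 */
	return env->imbalance ? sds.busiest : NULL;

An equivalent alternative would be a !env.imbalance check in load_balance()
itself, before calling find_busiest_queue(), so that we return immediately
in that case.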