On 10/16/19 5:26 PM, Vincent Guittot wrote: > On Wed, 16 Oct 2019 at 09:21, Parth Shah <pa...@linux.ibm.com> wrote: >> >> >> >> On 9/19/19 1:03 PM, Vincent Guittot wrote: >> >> [...] >> >>> Signed-off-by: Vincent Guittot <vincent.guit...@linaro.org> >>> --- >>> kernel/sched/fair.c | 585 >>> ++++++++++++++++++++++++++++++++++------------------ >>> 1 file changed, 380 insertions(+), 205 deletions(-) >>> >>> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c >>> index 017aad0..d33379c 100644 >>> --- a/kernel/sched/fair.c >>> +++ b/kernel/sched/fair.c >>> @@ -7078,11 +7078,26 @@ static unsigned long __read_mostly >>> max_load_balance_interval = HZ/10; >>> >>> enum fbq_type { regular, remote, all }; >>> >>> +/* >>> + * group_type describes the group of CPUs at the moment of the load >>> balance. >>> + * The enum is ordered by pulling priority, with the group with lowest >>> priority >>> + * first so the groupe_type can be simply compared when selecting the >>> busiest >>> + * group. see update_sd_pick_busiest(). >>> + */ >>> enum group_type { >>> - group_other = 0, >>> + group_has_spare = 0, >>> + group_fully_busy, >>> group_misfit_task, >>> + group_asym_packing, >>> group_imbalanced, >>> - group_overloaded, >>> + group_overloaded >>> +}; >>> + >>> +enum migration_type { >>> + migrate_load = 0, >>> + migrate_util, >>> + migrate_task, >>> + migrate_misfit >>> }; >>> >>> #define LBF_ALL_PINNED 0x01 >>> @@ -7115,7 +7130,7 @@ struct lb_env { >>> unsigned int loop_max; >>> >>> enum fbq_type fbq_type; >>> - enum group_type src_grp_type; >>> + enum migration_type balance_type; >>> struct list_head tasks; >>> }; >>> >>> @@ -7347,7 +7362,7 @@ static int detach_tasks(struct lb_env *env) >>> { >>> struct list_head *tasks = &env->src_rq->cfs_tasks; >>> struct task_struct *p; >>> - unsigned long load; >>> + unsigned long util, load; >>> int detached = 0; >>> >>> lockdep_assert_held(&env->src_rq->lock); >>> @@ -7380,19 +7395,53 @@ static int detach_tasks(struct lb_env *env) >>> if (!can_migrate_task(p, env)) >>> goto next; >>> >>> - load = task_h_load(p); >>> + switch (env->balance_type) { >>> + case migrate_load: >>> + load = task_h_load(p); >>> >>> - if (sched_feat(LB_MIN) && load < 16 && >>> !env->sd->nr_balance_failed) >>> - goto next; >>> + if (sched_feat(LB_MIN) && >>> + load < 16 && !env->sd->nr_balance_failed) >>> + goto next; >>> >>> - if ((load / 2) > env->imbalance) >>> - goto next; >>> + if ((load / 2) > env->imbalance) >>> + goto next; >>> + >>> + env->imbalance -= load; >>> + break; >>> + >>> + case migrate_util: >>> + util = task_util_est(p); >>> + >>> + if (util > env->imbalance) >> >> Can you please explain what would happen for >> `if (util/2 > env->imbalance)` ? >> just like when migrating load, even util shouldn't be migrated if >> env->imbalance is just near the utilization of the task being moved, isn't >> it? > > I have chosen uti and not util/2 to be conservative because > migrate_util is used to fill spare capacity. > With `if (util/2 > env->imbalance)`, we can more easily overload the > local group or pick too much utilization from the overloaded group. > fair enough. I missed the point that unlike migrate_load, with migrate_util, env->imbalance is just spare capacity of the local group. Thanks, Parth >> >>> + goto next; >>> + >>> + env->imbalance -= util; >>> + break; >>> +[ ... ] >> >> Thanks, >> Parth >>