On 2021/3/22 15:48, Peter Zijlstra wrote:
> On Sun, Mar 21, 2021 at 09:34:00PM +0800, Li, Aubrey wrote:
>> Hi Peter,
>>
>> On 2021/3/20 23:34, Peter Zijlstra wrote:
>>> On Fri, Mar 19, 2021 at 04:32:48PM -0400, Joel Fernandes (Google) wrote:
>>>> @@ -7530,8 +7543,9 @@ int can_migrate_task(struct task_struct *p, struct lb_env *env)
>>>>      * We do not migrate tasks that are:
>>>>      * 1) throttled_lb_pair, or
>>>>      * 2) cannot be migrated to this CPU due to cpus_ptr, or
>>>> -    * 3) running (obviously), or
>>>> -    * 4) are cache-hot on their current CPU.
>>>> +    * 3) task's cookie does not match with this CPU's core cookie
>>>> +    * 4) running (obviously), or
>>>> +    * 5) are cache-hot on their current CPU.
>>>>      */
>>>>     if (throttled_lb_pair(task_group(p), env->src_cpu, env->dst_cpu))
>>>>         return 0;
>>>> @@ -7566,6 +7580,13 @@ int can_migrate_task(struct task_struct *p, struct lb_env *env)
>>>>         return 0;
>>>>     }
>>>>
>>>> +   /*
>>>> +    * Don't migrate task if the task's cookie does not match
>>>> +    * with the destination CPU's core cookie.
>>>> +    */
>>>> +   if (!sched_core_cookie_match(cpu_rq(env->dst_cpu), p))
>>>> +       return 0;
>>>> +
>>>>     /* Record that we found atleast one task that could run on dst_cpu */
>>>>     env->flags &= ~LBF_ALL_PINNED;
>>>>
>>>
>>> This one is too strong.. persistent imbalance should be able to override
>>> it.
>>>
>>
>> IIRC, this change can avoid the following scenario:
>>
>> One sysbench cpu thread(cookieA) and sysbench mysql thread(cookieB) running
>> on the two siblings of core_1, the other sysbench cpu thread(cookieA) and
>> sysbench mysql thread(cookieB) running on the two siblings of core2, which
>> causes 50% force idle.
>>
>> This is not an imbalance case.
>
> But suppose there is an imbalance; then this cookie crud can forever
> stall balance.
>
> Imagine this cpu running a while(1); with a uniqie cookie on, then it
> will _never_ accept other tasks == BAD.
>

How about putting the following check in sched_core_cookie_match()?

+   /*
+    * Ignore cookie match if there is a big imbalance between the src rq
+    * and dst rq.
+    */
+   if ((src_rq->cfs.h_nr_running - rq->cfs.h_nr_running) > 1)
+       return true;

This change has a significant impact on my sysbench cpu+mysql colocation:

- with this change, sysbench cpu tput = 2796 events/s, sysbench mysql = 1315 events/s
- without it, sysbench cpu tput = 3513 events/s, sysbench mysql = 646 events/s

Do you have any suggestions before we drop it?

Thanks,
-Aubrey
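
For illustration only, the imbalance override proposed above can be sketched
as a standalone userspace model. Everything here is hypothetical and is not
kernel code: struct toy_rq, struct toy_task and toy_cookie_match() are
made-up stand-ins, the source runqueue is passed in explicitly, and the
existing handling of an idle destination core is left out. The sketch also
assumes the runnable counters are unsigned, so the difference is taken in a
signed type; otherwise a source rq with fewer tasks than the destination
would wrap around and look like a large imbalance.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the runqueue and task; not kernel structures. */
struct toy_rq {
	unsigned int h_nr_running;	/* runnable CFS tasks on this rq */
	unsigned long core_cookie;	/* cookie currently active on the core */
};

struct toy_task {
	unsigned long core_cookie;	/* cookie the task was tagged with */
};

/*
 * Return true if task p may be placed on dst_rq.
 * A cookie mismatch is ignored when src_rq has more than one runnable
 * task in excess of dst_rq (the "big imbalance" case from the proposal).
 */
static bool toy_cookie_match(const struct toy_rq *src_rq,
			     const struct toy_rq *dst_rq,
			     const struct toy_task *p)
{
	long diff;

	assert(src_rq && dst_rq && p);

	/* Signed difference: avoids unsigned wrap when src has fewer tasks. */
	diff = (long)src_rq->h_nr_running - (long)dst_rq->h_nr_running;
	if (diff > 1)
		return true;

	/* No large imbalance: fall back to the plain cookie comparison. */
	return dst_rq->core_cookie == p->core_cookie;
}
```

With a threshold of 1, a difference of two runnable tasks is already enough
to bypass the cookie check. In this model that makes the override fire
readily, which would be consistent with the throughput shift in the numbers
above, though the sketch itself does not demonstrate that.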