* Ken Chen <[EMAIL PROTECTED]> wrote:

> We recently discovered a nasty performance bug in the kernel CPU load
> balancer where we were hit by a 50% performance regression.
> 
> When tasks are assigned via cpu affinity to a subset of CPUs that
> spans sched_domains (either ccNUMA node or the new multi-core domain),
> the kernel fails to perform proper load balancing at these domains,
> because several pieces of logic in find_busiest_group() misidentify
> the busiest sched group within a given domain. This leads to
> inadequate load balancing and causes a 50% performance hit.
[...]
> So I propose the following fix: add additional logic in
> find_busiest_group() to detect intrinsic imbalance within the busiest
> group.  When such a condition is detected, load balancing goes into
> spread mode instead of the default grouping mode.

thanks - i've added your fix to the scheduler queue, and i'll check it 
with a few workloads too. (Right now the scheduler queue is blocked by a 
showstopper crasher bug in group scheduling and we are trying to fix 
that first, before doing any other change.)

        Ingo
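
For anyone who wants to recreate the kind of setup described in the quoted
report, here is a minimal sketch of pinning a process to a subset of CPUs
through the affinity interface, using Python's os.sched_getaffinity() and
os.sched_setaffinity() on Linux. The CPU numbers and the node layout
mentioned in the comments are assumptions for illustration only, and
pin_to_subset() is a hypothetical helper, not part of the patch or the
kernel.

```python
import os


def pin_to_subset(wanted):
    """Restrict the calling process to the CPUs in `wanted` that it may already use.

    Returns the affinity set in effect after the call.
    """
    allowed = os.sched_getaffinity(0)   # CPUs this process is currently allowed on
    target = allowed & set(wanted)      # never request a CPU outside the allowed set
    if not target:
        return allowed                  # nothing usable requested; leave affinity unchanged
    os.sched_setaffinity(0, target)
    return os.sched_getaffinity(0)


if __name__ == "__main__":
    # Assumed layout (hypothetical): CPUs 0-3 on one node, 4-7 on another,
    # so the set {2, 3, 4} would straddle two sched domains.
    print(sorted(pin_to_subset({2, 3, 4})))
```

Child processes inherit the affinity mask, so workers started after the
call stay confined to the same subset. Whether that subset really spans
two domains depends on the machine's topology, which this sketch does not
check.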