On Sun, 2 Mar 2008, Jeff Roberson wrote:
jeff 2008-03-02 08:20:59 UTC
FreeBSD src repository
Modified files:
sys/kern sched_ule.c
Log:
Add support for the new cpu topology api:
- When searching for affinity, search backwards in the tree from the last
cpu we ran on while the thread still has affinity for the group. This
can take advantage of knowledge of shared L2 or L3 caches among a
group of cores.
- When searching for the least loaded cpu, find the least loaded cpu via
the least loaded path through the tree. This load balances system bus
links, individual cache levels, and hyper-threaded/SMT cores.
- Make the periodic balancer recursively balance the highest and lowest
loaded cpu across each link.
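
As a rough illustration of the "least loaded path" search described above, here is a minimal user-space sketch. It is not the code from sched_ule.c: struct topo_node, topo_load() and topo_least_loaded() are hypothetical names, and summing raw per-cpu loads is an assumed, simplified load metric.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical, simplified topology node. This is NOT the kernel's
 * struct cpu_group; it only models a tree whose leaves are cpus.
 */
struct topo_node {
	int			 cpu;		/* cpu id for a leaf, -1 otherwise */
	int			 load;		/* run-queue load, leaves only */
	int			 nchildren;	/* 0 for a leaf */
	struct topo_node	*children;
};

/* Total load of every cpu beneath a node (assumed metric: plain sum). */
static int
topo_load(const struct topo_node *n)
{
	int i, total;

	if (n->nchildren == 0)
		return (n->load);
	total = 0;
	for (i = 0; i < n->nchildren; i++)
		total += topo_load(&n->children[i]);
	return (total);
}

/*
 * Walk down from the root, following the least loaded child at each
 * level, and return the cpu id of the leaf that is reached.
 */
static int
topo_least_loaded(const struct topo_node *n)
{
	const struct topo_node *best;
	int i, bestload, load;

	assert(n != NULL);
	while (n->nchildren > 0) {
		best = &n->children[0];
		bestload = topo_load(best);
		for (i = 1; i < n->nchildren; i++) {
			load = topo_load(&n->children[i]);
			if (load < bestload) {
				best = &n->children[i];
				bestload = load;
			}
		}
		n = best;
	}
	return (n->cpu);
}
```

With two packages whose cpus carry loads {0, 5} and {2, 2}, this walk picks cpu 2 from the lighter package instead of the globally idle cpu 0, which is the sense in which the path, and not just the individual cpu, is balanced.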
Add support for cpusets:
- Convert the cpuset to a simple native cpumask_t while the kernel still
only supports cpumask.
- Pass the derived cpumask down through the cpu_search functions to
restrict the result cpus.
- Make the various steal functions resilient to failure since not every
thread can run on every cpu any longer.
General improvements:
- Precisely track the lowest priority thread on every runq with
tdq_setlowpri(). Previously the value was only advisory, which led to
pathological behavior.
- Remove many #ifdef SMP conditions to simplify the code.
- Get rid of the old cumbersome tdq_group. This is more naturally
expressed via the cpu_group tree.
With these changes, ULE is the only scheduler that supports the new cpuset
api. The cpuset syscalls succeed on 4BSD, but that scheduler doesn't obey
the masks. I don't presently have a plan to implement it on 4BSD, as it
would potentially be very inefficient to search the runq for a compatible
thread on every context switch. I won't object if someone else wants to
implement this; otherwise I'll make the syscalls return ENOSYS if 4BSD is
compiled in.
The improved cpu topology load balancing is a mixed bag. On some
workloads we see considerable improvements. Right now mysql suffers when
it has large numbers of threads, but other things seem much improved. I
will continue to tune this, however, and in most cases it's already a
win.
Kris has done some excellent benchmarking as usual. Here you can see the
improvement in postgres depending on various scheduler debug settings:
http://people.freebsd.org/~kris/scaling/pgsql-16cpu.png
The horrible green line is 7.0 for reference. The blue line is the same
16-core machine with half of the cores disabled.
Thanks,
Jeff
Sponsored by: Nokia
Testing by: kris
Revision  Changes    Path
1.226     +443 -501  src/sys/kern/sched_ule.c