On 13 December 2012 15:53, Vincent Guittot <vincent.guit...@linaro.org> wrote:
> On 13 December 2012 15:25, Alex Shi <alex....@intel.com> wrote:
>> On 12/13/2012 06:11 PM, Vincent Guittot wrote:
>>> On 13 December 2012 03:17, Alex Shi <alex....@intel.com> wrote:
>>>> On 12/12/2012 09:31 PM, Vincent Guittot wrote:
>>>>> During the creation of sched_domain, we define a pack buddy CPU for
>>>>> each CPU when one is available. We want to pack at all levels where
>>>>> a group of CPUs can be power gated independently from the others.
>>>>> On a system that can't power gate a group of CPUs independently, the
>>>>> flag is set at all sched_domain levels and the buddy is set to -1.
>>>>> This is the default behavior.
>>>>> On a dual-cluster / dual-core system which can power gate each core
>>>>> and cluster independently, the buddy configuration will be:
>>>>>
>>>>>       | Cluster 0   | Cluster 1   |
>>>>>       | CPU0 | CPU1 | CPU2 | CPU3 |
>>>>> -----------------------------------
>>>>> buddy | CPU0 | CPU0 | CPU0 | CPU2 |
>>>>>
>>>>> Small tasks tend to slip out of the periodic load balance, so the
>>>>> best place to migrate them is at their wake up. The decision is O(1)
>>>>> as we only check against one buddy CPU.
>>>>
>>>> Just a little worry about the scalability on a big machine: on a
>>>> 4-socket NUMA machine with 8 cores * HT, the buddy CPU for the whole
>>>> system needs to care about 64 LCPUs, whereas in your example CPU0
>>>> only cares about 4 LCPUs. That is a different task distribution
>>>> decision.
>>>
>>> The buddy CPU should probably not be the same for all 64 LCPUs; it
>>> depends on where it's worth packing small tasks.
>>
>> Do you have further ideas for the buddy CPU in such an example?
>
> Yes, I have several ideas which were not really relevant for small
> systems but could be interesting for larger systems.
>
> We keep the same algorithm within a socket, but we could either use
> another LCPU in the targeted socket (conf0) or chain the sockets
> (conf1) instead of packing directly onto one LCPU.
>
> The scheme below tries to summarize the idea:
>
> Socket      | socket 0 | socket 1   | socket 2   | socket 3   |
> LCPU        | 0 | 1-15 | 16 | 17-31 | 32 | 33-47 | 48 | 49-63 |
> buddy conf0 | 0 | 0    | 1  | 16    | 2  | 32    | 3  | 48    |
> buddy conf1 | 0 | 0    | 0  | 16    | 16 | 32    | 32 | 48    |
> buddy conf2 | 0 | 0    | 16 | 16    | 32 | 32    | 48 | 48    |
>
> But I don't know how this can interact with NUMA load balancing, and
> it might be better to use conf3.
I mean conf2, not conf3.

>
>>>
>>> Which kind of sched_domain configuration do you have for such a
>>> system? And how many sched_domain levels do you have?
>>
>> It is the general x86 domain configuration, with 4 levels:
>> sibling/core/cpu/numa.
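
To make the O(1) wake-up decision described at the top of the thread
concrete, here is a minimal userspace model. It is only a sketch: the
function and variable names, the load thresholds and the buddy table are
assumptions for illustration, not the patch's actual code.

/*
 * Model of the pack-at-wake-up decision: a task waking on prev_cpu is
 * moved to that CPU's buddy only if the task is small and the buddy is
 * not already busy. Only one buddy CPU is examined, so the cost is O(1).
 */
#include <stdbool.h>
#include <stdio.h>

#define NR_CPUS		4
#define BUSY_LOAD	80	/* arbitrary "buddy is busy" threshold  */
#define SMALL_TASK_LOAD	10	/* arbitrary "task is small" threshold  */

/* Buddy table for the dual-cluster / dual-core example; -1 = no buddy. */
static const int pack_buddy[NR_CPUS] = { 0, 0, 0, 2 };

static int cpu_load[NR_CPUS];	/* stand-in for a per-CPU load metric */

static bool buddy_is_busy(int cpu)
{
	return cpu_load[cpu] > BUSY_LOAD;
}

/* Called at task wake-up. */
static int pick_pack_cpu(int prev_cpu, int task_load)
{
	int buddy = pack_buddy[prev_cpu];

	if (buddy < 0 || buddy == prev_cpu)
		return prev_cpu;	/* no packing buddy available */
	if (task_load <= SMALL_TASK_LOAD && !buddy_is_busy(buddy))
		return buddy;		/* pack the small task */
	return prev_cpu;		/* keep the task where it woke up */
}

int main(void)
{
	cpu_load[2] = 20;	/* buddy CPU2 is lightly loaded */
	printf("small task waking on CPU3 goes to CPU%d\n",
	       pick_pack_cpu(3, 5));
	return 0;
}

With these values the small task waking on CPU3 is packed onto CPU2,
CPU3's buddy in the cluster table above.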
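Similarly, the three buddy layouts in the 4-socket / 64-LCPU table can be
modelled as below: conf0 sends each socket leader to a distinct LCPU of
socket 0, conf1 chains each socket leader to the previous socket's leader,
and conf2 keeps all packing inside the socket, leaving cross-socket
consolidation to NUMA load balancing. Again this is purely illustrative;
the real buddies would be chosen while building the sched_domains.

#include <stdio.h>

#define NR_SOCKETS		4
#define LCPUS_PER_SOCKET	16
#define NR_CPUS			(NR_SOCKETS * LCPUS_PER_SOCKET)

enum conf { CONF0, CONF1, CONF2 };

static int pack_buddy(enum conf c, int cpu)
{
	int socket = cpu / LCPUS_PER_SOCKET;
	int leader = socket * LCPUS_PER_SOCKET;	/* first LCPU of the socket */

	if (cpu != leader)
		return leader;			/* always pack inside the socket */

	switch (c) {
	case CONF0:	/* leaders pack onto distinct LCPUs of socket 0 */
		return socket;
	case CONF1:	/* leaders chain to the previous socket's leader */
		return socket ? (socket - 1) * LCPUS_PER_SOCKET : 0;
	case CONF2:	/* no cross-socket packing */
	default:
		return leader;
	}
}

int main(void)
{
	/* Print the buddy of each socket leader; non-leaders always map
	 * to their socket leader in all three layouts. */
	for (int cpu = 0; cpu < NR_CPUS; cpu += LCPUS_PER_SOCKET)
		printf("LCPU %2d: conf0=%2d conf1=%2d conf2=%2d\n", cpu,
		       pack_buddy(CONF0, cpu), pack_buddy(CONF1, cpu),
		       pack_buddy(CONF2, cpu));
	return 0;
}

The output reproduces the buddy rows of the table above for LCPUs 0, 16,
32 and 48.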