Re: scheduler scalability - cgroups, cpusets and load-balancing

2008-01-29 Thread Steven Rostedt
On Tue, 29 Jan 2008, Gregory Haskins wrote: > > > > I would like to get our concepts clear, and terms consistent. That's > > important for those others who would try to understand this. > > Very good idea. Thanks for doing this! > Sorry for coming in so late, I've been banging my head on differ

Re: scheduler scalability - cgroups, cpusets and load-balancing

2008-01-29 Thread Gregory Haskins
>>> On Tue, Jan 29, 2008 at 4:02 PM, in message <[EMAIL PROTECTED]>, Paul Jackson <[EMAIL PROTECTED]> wrote: > Gregory wrote: >> > ... (1) turning off >> > sched_load_balance in any overlapping cpusets, including all >> > encompassing parent cpusets, (2) leaving sched_load_balance on in the >> >

Re: scheduler scalability - cgroups, cpusets and load-balancing

2008-01-29 Thread Gregory Haskins
>>> On Tue, Jan 29, 2008 at 3:56 PM, in message <[EMAIL PROTECTED]>, Paul Jackson <[EMAIL PROTECTED]> wrote: > Gregory wrote: >> By moving it into the root_domain structure, there is now an instance >> per (um, for lack of a better, more up to date word) "exclusive" >> cpuset. That way, dispara

Re: scheduler scalability - cgroups, cpusets and load-balancing

2008-01-29 Thread Paul Jackson
Gregory wrote: > > ... (1) turning off > > sched_load_balance in any overlapping cpusets, including all > > encompassing parent cpusets, (2) leaving sched_load_balance on in the > > RT cpuset itself, and ... > > Technically you only need (2). I run my 4-8 core development systems > in the single d
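
Steps (1) and (2) of the scheme quoted above can be sketched as a small planning function. This is only an illustration under stated assumptions: the cpuset names, the CPU sets, and the helper `plan_flags` are hypothetical, the quoted list is truncated so any further steps are not covered, and the reply in this same message suggests that only step (2) may be strictly necessary.

```python
def plan_flags(cpusets, rt_name):
    """Return the desired sched_load_balance setting per cpuset.

    cpusets: dict mapping a (hypothetical) cpuset name to its set of CPU ids.
    rt_name: name of the realtime cpuset.
    Cpusets that do not overlap the RT cpuset are left out of the plan,
    meaning their flag is not touched.
    """
    rt_cpus = cpusets[rt_name]
    plan = {}
    for name, cpus in cpusets.items():
        if name == rt_name:
            # Step (2): leave the flag on in the RT cpuset itself.
            plan[name] = True
        elif cpus & rt_cpus:
            # Step (1): turn the flag off in any overlapping cpuset,
            # which includes every encompassing parent.
            plan[name] = False
    return plan


# Hypothetical layout: a root cpuset, an RT cpuset and a batch cpuset.
example = {"/": {0, 1, 2, 3}, "/rt": {2, 3}, "/batch": {0, 1}}
result = plan_flags(example, "/rt")
```

Here the root cpuset overlaps `/rt` and is switched off, while `/batch` shares no CPUs with it and is left alone.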

Re: scheduler scalability - cgroups, cpusets and load-balancing

2008-01-29 Thread Paul Jackson
Gregory wrote: > By moving it into the root_domain structure, there is now an instance > per (um, for lack of a better, more up to date word) "exclusive" > cpuset. That way, disparate cpusets will not bother each other with > overload notifications, etc. So the root_domain structure is meant to

Re: scheduler scalability - cgroups, cpusets and load-balancing

2008-01-29 Thread Gregory Haskins
>>> On Tue, Jan 29, 2008 at 2:04 PM, in message <[EMAIL PROTECTED]>, Paul Jackson <[EMAIL PROTECTED]> wrote: > Gregory wrote: >> IMHO it works well the way it is: The user selects the class for a >> particular task using sched_setscheduler(), and they select the cpuset >> (or inherit it) that de

Re: scheduler scalability - cgroups, cpusets and load-balancing

2008-01-29 Thread Gregory Haskins
>>> On Tue, Jan 29, 2008 at 2:37 PM, in message <[EMAIL PROTECTED]>, Paul Jackson <[EMAIL PROTECTED]> wrote: > Gregory wrote: >> > 1) What are 'per-domain' variables? >> >> s/per-domain/per-root-domain > > Oh dear - now I've got more questions, not fewer. > > 1) "variables" ... what variable

Re: scheduler scalability - cgroups, cpusets and load-balancing

2008-01-29 Thread Paul Jackson
Gregory wrote: > > 1) What are 'per-domain' variables? > > s/per-domain/per-root-domain Oh dear - now I've got more questions, not fewer. 1) "variables" ... what variables? 2) Is a 'root-domain' just the RT specific portion of a sched_domain, or is it something else? --

Re: scheduler scalability - cgroups, cpusets and load-balancing

2008-01-29 Thread Paul Jackson
Gregory wrote: > IMHO it works well the way it is: The user selects the class for a > particular task using sched_setscheduler(), and they select the cpuset > (or inherit it) that defines its execution scope. If that scope has > balancing enabled, the policy for the member classes is in effect.

Re: scheduler scalability - cgroups, cpusets and load-balancing

2008-01-29 Thread Gregory Haskins
>>> On Tue, Jan 29, 2008 at 11:51 AM, in message <[EMAIL PROTECTED]>, Paul Jackson <[EMAIL PROTECTED]> wrote: > Gregory wrote: >> This is correct. We have the balance policy polymorphically associated >> with each sched_class, and the CFS load-balancer and RT "load" (really, >> priority) balancer

Re: scheduler scalability - cgroups, cpusets and load-balancing

2008-01-29 Thread Paul Jackson
Gregory wrote: > This is correct. We have the balance policy polymorphically associated > with each sched_class, and the CFS load-balancer and RT "load" (really, > priority) balancer can coexist together at the same time and across > arbitrary #s of cores So ... we have the option of having all s

Re: scheduler scalability - cgroups, cpusets and load-balancing

2008-01-29 Thread Gregory Haskins
>>> On Tue, Jan 29, 2008 at 11:28 AM, in message <[EMAIL PROTECTED]>, Paul Jackson <[EMAIL PROTECTED]> wrote: > Gregory wrote: >> I am a bit confused as to why you disable load-balancing in the >> RT cpuset? It shouldn't be strictly necessary in order for the >> RT scheduler to do its job (

Re: scheduler scalability - cgroups, cpusets and load-balancing

2008-01-29 Thread Paul Jackson
Gregory wrote: > What about exclusive cpusets? Don't they create a > new sched-domain or did I misunderstand there? cpu_exclusive cpusets no longer determine sched domains. I just said more on this in an earlier reply. -- I won't rest till it's the best ... P

Re: scheduler scalability - cgroups, cpusets and load-balancing

2008-01-29 Thread Paul Jackson
Gregory wrote: > I am a bit confused as to why you disable load-balancing in the > RT cpuset? It shouldn't be strictly necessary in order for the > RT scheduler to do its job (unless I am misunderstanding what you > are trying to accomplish?). Do you do this because you *have* > to in o

Re: scheduler scalability - cgroups, cpusets and load-balancing

2008-01-29 Thread Gregory Haskins
>>> On Tue, Jan 29, 2008 at 7:12 AM, in message <[EMAIL PROTECTED]>, Paul Jackson <[EMAIL PROTECTED]> wrote: > Peter, replying to Paul: >> > 3) you turn off sched_load_balance in that realtime cpuset. >> >> Ah, I don't think 3 is needed. Quite to the contrary, there is quite a >> large body of

Re: scheduler scalability - cgroups, cpusets and load-balancing

2008-01-29 Thread Gregory Haskins
>>> On Tue, Jan 29, 2008 at 6:30 AM, in message <[EMAIL PROTECTED]>, Paul Jackson <[EMAIL PROTECTED]> wrote: > Peter wrote, in reply to Peter ;): >> > [ It looks to me it balances a group over the largest SD the current cpu >> > has access to, even though that might be larger than the SD associ

Re: scheduler scalability - cgroups, cpusets and load-balancing

2008-01-29 Thread Gregory Haskins
>>> On Tue, Jan 29, 2008 at 6:50 AM, in message <[EMAIL PROTECTED]>, Peter Zijlstra <[EMAIL PROTECTED]> wrote: > On Tue, 2008-01-29 at 05:30 -0600, Paul Jackson wrote: >> Peter wrote, in reply to Peter ;): >> > > [ It looks to me it balances a group over the largest SD the current cpu >> > > h

Re: scheduler scalability - cgroups, cpusets and load-balancing

2008-01-29 Thread Peter Zijlstra
On Tue, 2008-01-29 at 06:52 -0600, Paul Jackson wrote: > > Ok, I'll take a stab at understanding that code. > > See also the section: > > 1.7 What is sched_load_balance ? > > in Documentation/cpusets.txt. > > Good luck ;). It seems Gregory tricked us both: 57d885fea0da0e9541d7730a9e1dcf7

Re: scheduler scalability - cgroups, cpusets and load-balancing

2008-01-29 Thread Paul Jackson
> Ok, I'll take a stab at understanding that code. See also the section: 1.7 What is sched_load_balance ? in Documentation/cpusets.txt. Good luck ;). -- I won't rest till it's the best ... Programmer, Linux Scalability Paul Jackson <[EMA

Re: scheduler scalability - cgroups, cpusets and load-balancing

2008-01-29 Thread Peter Zijlstra
On Tue, 2008-01-29 at 06:03 -0600, Paul Jackson wrote: > Paul, responding to Peter: > > > We now have a per-cpuset Boolean flag file called 'sched_load_balance'. > > > > SD_LOAD_BALANCE, right? > > No. SD_LOAD_BALANCE is some attribute of sched domains. > > The 'sched_load_balance' flag is an

Re: scheduler scalability - cgroups, cpusets and load-balancing

2008-01-29 Thread Paul Jackson
Peter, responding to Paul: > > I really doubt we'd want to have such systems triggering the hard RT > > scheduler on whatever CPUs were in the batch scheduler's big cpuset > > that didn't happen to have an active job currently assigned to them. > > My turn to be confused.. > > If SD_LOAD_BALANCE

Re: scheduler scalability - cgroups, cpusets and load-balancing

2008-01-29 Thread Paul Jackson
vatsa wrote to Peter: > After reading your explanation in the other mail about what you mean here, > I agree. Ah good - glad someone understood that. -- I won't rest till it's the best ... Programmer, Linux Scalability Paul Jackson <[EMAIL PROT

Re: scheduler scalability - cgroups, cpusets and load-balancing

2008-01-29 Thread Peter Zijlstra
On Tue, 2008-01-29 at 05:53 -0600, Paul Jackson wrote: > Peter wrote; > > So, I don't think we need that, I think we can do with the single flag, > > we just need to find these disjoint sets and stick our rt-domain there. > > Ah - perhaps you don't need that flag - but my other cpuset users do ;

Re: scheduler scalability - cgroups, cpusets and load-balancing

2008-01-29 Thread Srivatsa Vaddagiri
On Tue, Jan 29, 2008 at 11:57:22AM +0100, Peter Zijlstra wrote: > On Tue, 2008-01-29 at 10:53 +0100, Peter Zijlstra wrote: > > > My thoughts were to make stronger use of disjoint cpu-sets. cgroups and > > cpusets are related, in that cpusets provide a property to a cgroup. > > However, load_balanc

Re: scheduler scalability - cgroups, cpusets and load-balancing

2008-01-29 Thread Paul Jackson
Peter, replying to Paul: > > 3) you turn off sched_load_balance in that realtime cpuset. > > Ah, I don't think 3 is needed. Quite to the contrary, there is quite a > large body of research work covering the scheduling of (hard and soft) > realtime tasks on multiple cpus. Well, the way it's coded

Re: scheduler scalability - cgroups, cpusets and load-balancing

2008-01-29 Thread Peter Zijlstra
On Tue, 2008-01-29 at 05:30 -0600, Paul Jackson wrote: > Peter wrote, in reply to Peter ;): > > > [ It looks to me it balances a group over the largest SD the current cpu > > > has access to, even though that might be larger than the SD associated > > > with the cpuset of that particular cgrou

Re: scheduler scalability - cgroups, cpusets and load-balancing

2008-01-29 Thread Paul Jackson
Paul, responding to Peter: > > We now have a per-cpuset Boolean flag file called 'sched_load_balance'. > > SD_LOAD_BALANCE, right? No. SD_LOAD_BALANCE is some attribute of sched domains. The 'sched_load_balance' flag is an attribute of cpusets. The mapping of cpusets to sched domains required

Re: scheduler scalability - cgroups, cpusets and load-balancing

2008-01-29 Thread Paul Jackson
Peter wrote; > So, I don't think we need that, I think we can do with the single flag, > we just need to find these disjoint sets and stick our rt-domain there. Ah - perhaps you don't need that flag - but my other cpuset users do ;). You see, there are two very different ways that 'sched_load_ba

Re: scheduler scalability - cgroups, cpusets and load-balancing

2008-01-29 Thread Paul Jackson
Paul, talking to himself: > At that point, sched domains are rebuilt, including providing a > sched domain that just contains the CPUs in that realtime cpuset, and > normal scheduler load balancing ceases on the CPUs in that realtime > cpuset. Oops - correction - at that point sched domains are re

Re: scheduler scalability - cgroups, cpusets and load-balancing

2008-01-29 Thread Peter Zijlstra
On Tue, 2008-01-29 at 05:13 -0600, Paul Jackson wrote: > Peter wrote: > > Thanks for the link. Yes I think your last suggestion of creating > > rt-domains ( http://lkml.org/lkml/2007/10/23/419 ) is a good one. > > We now have a per-cpuset Boolean flag file called 'sched_load_balance'. SD_LOAD_BA

Re: scheduler scalability - cgroups, cpusets and load-balancing

2008-01-29 Thread Paul Jackson
Peter wrote, in reply to Peter ;): > > [ It looks to me it balances a group over the largest SD the current cpu > > has access to, even though that might be larger than the SD associated > > with the cpuset of that particular cgroup. ] > > Hmm, with a bit more thought I think that does indeed

Re: scheduler scalability - cgroups, cpusets and load-balancing

2008-01-29 Thread Paul Jackson
Peter wrote: > Thanks for the link. Yes I think your last suggestion of creating > rt-domains ( http://lkml.org/lkml/2007/10/23/419 ) is a good one. We now have a per-cpuset Boolean flag file called 'sched_load_balance'. In the default case, this flag is set on, and the kernel does its usual load
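
As a rough illustration of the per-cpuset flag file described above, the following Python sketch writes and reads a file named `sched_load_balance` inside a cpuset directory. The helper names and the use of a temporary directory are assumptions made for illustration and do not come from the thread. On a real system the directory would be a cpuset under the mounted cpuset filesystem, and the exact mount point and file name should be checked against Documentation/cpusets.txt.

```python
import os
import tempfile

FLAG_FILE = "sched_load_balance"  # flag file name as given in the thread


def write_flag(cpuset_dir, enabled):
    """Write '1' (on) or '0' (off) to the flag file in cpuset_dir."""
    with open(os.path.join(cpuset_dir, FLAG_FILE), "w") as f:
        f.write("1" if enabled else "0")


def read_flag(cpuset_dir):
    """Return True if the flag file in cpuset_dir contains '1'."""
    with open(os.path.join(cpuset_dir, FLAG_FILE)) as f:
        return f.read().strip() == "1"


def demo():
    """Exercise the helpers against a throwaway directory, not a real cpuset."""
    with tempfile.TemporaryDirectory() as fake_cpuset:
        write_flag(fake_cpuset, True)    # default case: flag set on
        before = read_flag(fake_cpuset)
        write_flag(fake_cpuset, False)   # turn load balancing off
        after = read_flag(fake_cpuset)
    return before, after
```

Calling `demo()` only touches a temporary directory, so it is safe to run without root privileges or a mounted cpuset hierarchy.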

Re: scheduler scalability - cgroups, cpusets and load-balancing

2008-01-29 Thread Peter Zijlstra
Here I go, talking to myself.. On Tue, 2008-01-29 at 10:53 +0100, Peter Zijlstra wrote: > My thoughts were to make stronger use of disjoint cpu-sets. cgroups and > cpusets are related, in that cpusets provide a property to a cgroup. > However, load_balance_monitor()'s interaction with sched doma

Re: scheduler scalability - cgroups, cpusets and load-balancing

2008-01-29 Thread Peter Zijlstra
On Tue, 2008-01-29 at 04:01 -0600, Paul Jackson wrote: > Peter wrote: > > Also the RT load-balance needs to become aware of such these sets, I > > think Paul J and Steven once talked about it, but can't quite remember > > where that ended > > See further the thread: > > http://lkml.org/lkml/20

Re: scheduler scalability - cgroups, cpusets and load-balancing

2008-01-29 Thread Paul Jackson
Peter wrote: > Also the RT load-balance needs to become aware of such these sets, I > think Paul J and Steven once talked about it, but can't quite remember > where that ended See further the thread: http://lkml.org/lkml/2007/10/22/400 (I don't remember where it ended up either; probably nowhe

scheduler scalability - cgroups, cpusets and load-balancing

2008-01-29 Thread Peter Zijlstra
Hi All, Some of the fancy new scheduler features such as the cgroup load balancer (load_balance_monitor) and the real-time load balancer are a bit of a scalability issue. They all seem to want a rather strong global bound to keep a global fairness (which is quite understandable). [ my own intere