DRM and/or X trouble (was Re: CFS review)

2007-08-31 Thread Rene Herman
On 08/31/2007 08:46 AM, Tilman Sauerbeck wrote: On 08/29/2007 09:56 PM, Rene Herman wrote: With X server 1.3, I'm getting consistent crashes with two glxgear instances running. So, if you're getting any output, it's better than my situation. Before people focus on software rendering too muc

Re: CFS review

2007-08-30 Thread Tilman Sauerbeck
Rene Herman [2007-08-30 09:05]: > On 08/29/2007 09:56 PM, Rene Herman wrote: > > Realised the BUGs may mean the kernel DRM people could want to be in CC... > > > On 08/29/2007 05:57 PM, Keith Packard wrote: > > > >> With X server 1.3, I'm getting consistent crashes with two glxgear > >> instance

Re: CFS review

2007-08-30 Thread Rene Herman
pare.) I didn't compare -- it no doubt will. I know the title of this thread is "CFS review" but it turned into Keith Packard noticing glxgears being broken on recent-ish X.org. The start of the thread was about things being broken using _software_ rendering though, so I thought it m

Re: CFS review

2007-08-30 Thread Chuck Ebbert
On 08/29/2007 03:56 PM, Rene Herman wrote: > > Before people focus on software rendering too much -- also with 1.3.0 (and > a Matrox Millennium G550 AGP, 32M) glxgears also works decidedly crummy > using > hardware rendering. While I can move the glxgears window itself, the actual > spinning wheel

Re: CFS review

2007-08-30 Thread Ingo Molnar
* Rene Herman <[EMAIL PROTECTED]> wrote: > Realised the BUGs may mean the kernel DRM people could want to be in CC... and note that the schedule() call in there is not part of the crash backtrace: > >Call Trace: > > [] drm_lock+0x255/0x2de > > [] mga_dma_buffers+0x0/0x2e3 > > [] drm_ioctl+0x14

Re: CFS review

2007-08-30 Thread Rene Herman
On 08/29/2007 09:56 PM, Rene Herman wrote: Realised the BUGs may mean the kernel DRM people could want to be in CC... On 08/29/2007 05:57 PM, Keith Packard wrote: With X server 1.3, I'm getting consistent crashes with two glxgear instances running. So, if you're getting any output, it's bette

Re: CFS review

2007-08-29 Thread Rene Herman
On 08/29/2007 05:57 PM, Keith Packard wrote: With X server 1.3, I'm getting consistent crashes with two glxgear instances running. So, if you're getting any output, it's better than my situation. Before people focus on software rendering too much -- also with 1.3.0 (and a Matrox Millennium G55

Re: CFS review

2007-08-29 Thread Keith Packard
On Wed, 2007-08-29 at 10:04 +0200, Ingo Molnar wrote: > is that old enough to not have the smart X scheduler? The smart scheduler went into the server in like 2000. I don't think you've got any systems that old. XFree86 4.1 or 4.2, I can't remember which. > (probably > the GLX bug you mentioned

Re: CFS review

2007-08-29 Thread Bill Davidsen
Ingo Molnar wrote: * Bill Davidsen <[EMAIL PROTECTED]> wrote: There is another way to show the problem visually under X (vesa-driver), by starting 3 gears simultaneously, which after laying them out side-by-side need some settling time before smoothing out. Without __update_curr it's abso

Re: CFS review

2007-08-29 Thread Al Boldi
Ingo Molnar wrote: > * Keith Packard <[EMAIL PROTECTED]> wrote: > > Make sure the X server isn't running with the smart scheduler > > disabled; that will cause precisely the symptoms you're seeing here. > > In the normal upstream sources, you'd have to use '-dumbSched' as an X > > server command li

Re: CFS review

2007-08-29 Thread Ingo Molnar
* Keith Packard <[EMAIL PROTECTED]> wrote: > Make sure the X server isn't running with the smart scheduler > disabled; that will cause precisely the symptoms you're seeing here. > In the normal upstream sources, you'd have to use '-dumbSched' as an X > server command line option. > > The old

Re: CFS review

2007-08-29 Thread Keith Packard
On Wed, 2007-08-29 at 06:46 +0200, Ingo Molnar wrote: > ok, i finally managed to reproduce the "artifact" myself on an older > box. It goes like this: start up X with the vesa driver (or with NoDRI) > to force software rendering. Then start up a couple of glxgears > instances. Those glxgears in

Re: CFS review

2007-08-28 Thread Ingo Molnar
* Al Boldi <[EMAIL PROTECTED]> wrote: > > se.sleep_max : 2194711437 > > se.block_max : 0 > > se.exec_max : 977446 > > se.wait_max : 1912321 > > > > the scheduler itself had a worst-case sched

Re: CFS review

2007-08-28 Thread Al Boldi
Ingo Molnar wrote: > * Al Boldi <[EMAIL PROTECTED]> wrote: > > I have narrowed it down a bit to add_wait_runtime. > > the scheduler is a red herring here. Could you "strace -ttt -TTT" one of > the glxgears instances (and send us the cfs-debug-info.sh output, with > CONFIG_SCHED_DEBUG=y and CONFIG_S

Re: CFS review

2007-08-28 Thread Ingo Molnar
* Al Boldi <[EMAIL PROTECTED]> wrote: > I have narrowed it down a bit to add_wait_runtime. the scheduler is a red herring here. Could you "strace -ttt -TTT" one of the glxgears instances (and send us the cfs-debug-info.sh output, with CONFIG_SCHED_DEBUG=y and CONFIG_SCHEDSTATS=y as requested b

Re: CFS review

2007-08-28 Thread Mike Galbraith
On Wed, 2007-08-29 at 06:18 +0200, Ingo Molnar wrote: > * Al Boldi <[EMAIL PROTECTED]> wrote: > > > No need for framebuffer. All you need is X using the X.org > > vesa-driver. Then start gears like this: > > > > # gears & gears & gears & > > > > Then lay them out side by side to see the per

Re: CFS review

2007-08-28 Thread Keith Packard
On Wed, 2007-08-29 at 06:18 +0200, Ingo Molnar wrote: > > Then lay them out side by side to see the periodic stallings for > > ~10sec. The X scheduling code isn't really designed to handle software GL well; the requests can be very expensive to execute, and yet are specified as atomic operations

Re: CFS review

2007-08-28 Thread Al Boldi
Ingo Molnar wrote: > * Linus Torvalds <[EMAIL PROTECTED]> wrote: > > On Tue, 28 Aug 2007, Al Boldi wrote: > > > I like your analysis, but how do you explain that these stalls > > > vanish when __update_curr is disabled? > > > > It's entirely possible that what happens is that the X scheduling is >

Re: CFS review

2007-08-28 Thread Ingo Molnar
* Al Boldi <[EMAIL PROTECTED]> wrote: > No need for framebuffer. All you need is X using the X.org > vesa-driver. Then start gears like this: > > # gears & gears & gears & > > Then lay them out side by side to see the periodic stallings for > ~10sec. i just tried something similar (by ad

Re: CFS review

2007-08-28 Thread Ingo Molnar
* Bill Davidsen <[EMAIL PROTECTED]> wrote: > > There is another way to show the problem visually under X > > (vesa-driver), by starting 3 gears simultaneously, which after > > laying them out side-by-side need some settling time before > > smoothing out. Without __update_curr it's absolutely

Re: CFS review

2007-08-28 Thread Bill Davidsen
Ingo Molnar wrote: * Al Boldi <[EMAIL PROTECTED]> wrote: ok. I think i might finally have found the bug causing this. Could you try the fix below, does your webserver thread-startup test work any better? It seems to help somewhat, but the problem is still visible. Even v20.3 on 2.6.22.5 didn

Re: CFS review

2007-08-28 Thread Bill Davidsen
Al Boldi wrote: Ingo Molnar wrote: * Al Boldi <[EMAIL PROTECTED]> wrote: The problem is that consecutive runs don't give consistent results and sometimes stalls. You may want to try that. well, there's a natural saturation point after a few hundred tasks (depending on your CPU's speed), at wh

Re: CFS review

2007-08-28 Thread Valdis . Kletnieks
On Mon, 27 Aug 2007 22:05:37 PDT, Linus Torvalds said: > > > On Tue, 28 Aug 2007, Al Boldi wrote: > > > > No need for framebuffer. All you need is X using the X.org vesa-driver. > > Then start gears like this: > > > > # gears & gears & gears & > > > > Then lay them out side by side to see

Re: CFS review

2007-08-28 Thread Ingo Molnar
* Willy Tarreau <[EMAIL PROTECTED]> wrote: > On Tue, Aug 28, 2007 at 10:02:18AM +0200, Ingo Molnar wrote: > > > > * Xavier Bestel <[EMAIL PROTECTED]> wrote: > > > > > Are you sure they are stalled ? What you may have is simple gears > > > running at a multiple of your screen refresh rate, so t

Re: CFS review

2007-08-28 Thread Willy Tarreau
On Tue, Aug 28, 2007 at 10:02:18AM +0200, Ingo Molnar wrote: > > * Xavier Bestel <[EMAIL PROTECTED]> wrote: > > > Are you sure they are stalled ? What you may have is simple gears > > running at a multiple of your screen refresh rate, so they only appear > > stalled. > > > > Plus, as said Linu

Re: CFS review

2007-08-28 Thread Ingo Molnar
* Linus Torvalds <[EMAIL PROTECTED]> wrote: > On Tue, 28 Aug 2007, Al Boldi wrote: > > > > I like your analysis, but how do you explain that these stalls > > vanish when __update_curr is disabled? > > It's entirely possible that what happens is that the X scheduling is > just a slightly unsta

Re: CFS review

2007-08-28 Thread Arjan van de Ven
On Tue, 28 Aug 2007 09:34:03 -0700 (PDT) Linus Torvalds <[EMAIL PROTECTED]> wrote: > > > On Tue, 28 Aug 2007, Al Boldi wrote: > > > > I like your analysis, but how do you explain that these stalls > > vanish when __update_curr is disabled? > > It's entirely possible that what happens is that t

Re: CFS review

2007-08-28 Thread Linus Torvalds
On Tue, 28 Aug 2007, Al Boldi wrote: > > I like your analysis, but how do you explain that these stalls vanish when > __update_curr is disabled? It's entirely possible that what happens is that the X scheduling is just a slightly unstable system - which effectively would turn a small schedul

Re: CFS review

2007-08-28 Thread Ingo Molnar
* Xavier Bestel <[EMAIL PROTECTED]> wrote: > Are you sure they are stalled ? What you may have is simple gears > running at a multiple of your screen refresh rate, so they only appear > stalled. > > Plus, as said Linus, you're not really testing the kernel scheduler. > gears is really bad ben

Re: CFS review

2007-08-28 Thread Xavier Bestel
On Tue, 2007-08-28 at 07:37 +0300, Al Boldi wrote: > start gears like this: > > # gears & gears & gears & > > Then lay them out side by side to see the periodic stallings for > ~10sec. Are you sure they are stalled ? What you may have is simple gears running at a multiple of your screen refres

Re: CFS review

2007-08-28 Thread Ingo Molnar
* Mike Galbraith <[EMAIL PROTECTED]> wrote: > > I like your analysis, but how do you explain that these stalls > > vanish when __update_curr is disabled? > > When you disable __update_curr(), you're utterly destroying the > scheduler. There may well be a scheduler connection, but disabling > _

Re: CFS review

2007-08-28 Thread Mike Galbraith
On Tue, 2007-08-28 at 08:23 +0300, Al Boldi wrote: > Linus Torvalds wrote: > > On Tue, 28 Aug 2007, Al Boldi wrote: > > > No need for framebuffer. All you need is X using the X.org vesa-driver. > > > Then start gears like this: > > > > > > # gears & gears & gears & > > > > > > Then lay them out

Re: CFS review

2007-08-27 Thread Al Boldi
Linus Torvalds wrote: > On Tue, 28 Aug 2007, Al Boldi wrote: > > No need for framebuffer. All you need is X using the X.org vesa-driver. > > Then start gears like this: > > > > # gears & gears & gears & > > > > Then lay them out side by side to see the periodic stallings for ~10sec. > > I don't

Re: CFS review

2007-08-27 Thread Linus Torvalds
On Tue, 28 Aug 2007, Al Boldi wrote: > > No need for framebuffer. All you need is X using the X.org vesa-driver. > Then start gears like this: > > # gears & gears & gears & > > Then lay them out side by side to see the periodic stallings for ~10sec. I don't think this is a good test. Wh

Re: CFS review

2007-08-27 Thread Al Boldi
Ingo Molnar wrote: > * Al Boldi <[EMAIL PROTECTED]> wrote: > > > Could you try the patch below instead, does this make 3x glxgears > > > smooth again? (if yes, could you send me your Signed-off-by line as > > > well.) > > > > The task-startup stalling is still there for ~10sec. > > > > Can you see

Re: CFS review

2007-08-27 Thread Ingo Molnar
* Al Boldi <[EMAIL PROTECTED]> wrote: > > Could you try the patch below instead, does this make 3x glxgears > > smooth again? (if yes, could you send me your Signed-off-by line as > > well.) > > The task-startup stalling is still there for ~10sec. > > Can you see the problem on your machine?

Re: CFS review

2007-08-27 Thread Al Boldi
Ingo Molnar wrote: > * Al Boldi <[EMAIL PROTECTED]> wrote: > > > could you send the exact patch that shows what you did? > > > > On 2.6.22.5-v20.3 (not v20.4): > > > > 340-curr->delta_exec += delta_exec; > > 341- > > 342-if (unlikely(curr->delta_exec > sysctl_sched_stat_granularity)) > > {

Re: CFS review

2007-08-27 Thread Ingo Molnar
* Al Boldi <[EMAIL PROTECTED]> wrote: > > could you send the exact patch that shows what you did? > > On 2.6.22.5-v20.3 (not v20.4): > > 340-curr->delta_exec += delta_exec; > 341- > 342-if (unlikely(curr->delta_exec > sysctl_sched_stat_granularity)) { > 343:// __update_curr(cfs

Re: CFS review

2007-08-26 Thread Al Boldi
Ingo Molnar wrote: > * Al Boldi <[EMAIL PROTECTED]> wrote: > > > and could you also check 20.4 on 2.6.22.5 perhaps, or very latest > > > -git? (Peter has experienced smaller spikes with that.) > > > > Ok, I tried all your suggestions, but nothing works as smooth as > > removing __update_curr. > > c

Re: CFS review

2007-08-26 Thread Ingo Molnar
* Al Boldi <[EMAIL PROTECTED]> wrote: > > and could you also check 20.4 on 2.6.22.5 perhaps, or very latest > > -git? (Peter has experienced smaller spikes with that.) > > Ok, I tried all your suggestions, but nothing works as smooth as > removing __update_curr. could you send the exact patch

Re: CFS review

2007-08-26 Thread Al Boldi
Ingo Molnar wrote: > * Al Boldi <[EMAIL PROTECTED]> wrote: > > > ok. I think i might finally have found the bug causing this. Could > > > you try the fix below, does your webserver thread-startup test work > > > any better? > > > > It seems to help somewhat, but the problem is still visible. Even

Re: CFS review

2007-08-25 Thread Ingo Molnar
* Al Boldi <[EMAIL PROTECTED]> wrote: > > ok. I think i might finally have found the bug causing this. Could > > you try the fix below, does your webserver thread-startup test work > > any better? > > It seems to help somewhat, but the problem is still visible. Even > v20.3 on 2.6.22.5 didn'

Re: CFS review

2007-08-25 Thread Al Boldi
Ingo Molnar wrote: > * Al Boldi <[EMAIL PROTECTED]> wrote: > > > > The problem is that consecutive runs don't give consistent results > > > > and sometimes stalls. You may want to try that. > > > > > > well, there's a natural saturation point after a few hundred tasks > > > (depending on your CPU'

Re: CFS review

2007-08-24 Thread Ingo Molnar
* Al Boldi <[EMAIL PROTECTED]> wrote: > > > The problem is that consecutive runs don't give consistent results > > > and sometimes stalls. You may want to try that. > > > > well, there's a natural saturation point after a few hundred tasks > > (depending on your CPU's speed), at which point th

Re: CFS review

2007-08-21 Thread Al Boldi
Ingo Molnar wrote: > * Al Boldi <[EMAIL PROTECTED]> wrote: > > There is one workload that still isn't performing well; it's a > > web-server workload that spawns 1K+ client procs. It can be emulated > > by using this: > > > > for i in `seq 1 to `; do ping 10.1 -A > /dev/null & done > > on ba

Re: CFS review

2007-08-21 Thread Roman Zippel
Hi, On Tue, 21 Aug 2007, Mike Galbraith wrote: > I thought this was history. With your config, I was finally able to > reproduce the anomaly (only with your proggy though), and Ingo's patch > does indeed fix it here. > > Freshly reproduced anomaly and patch verification, running 2.6.23-rc3 > wi

Re: CFS review

2007-08-21 Thread Ingo Molnar
* Al Boldi <[EMAIL PROTECTED]> wrote: > There is one workload that still isn't performing well; it's a > web-server workload that spawns 1K+ client procs. It can be emulated > by using this: > > for i in `seq 1 to `; do ping 10.1 -A > /dev/null & done on bash i did this as: for ((i=

Re: CFS review

2007-08-21 Thread Ingo Molnar
* Mike Galbraith <[EMAIL PROTECTED]> wrote: > > It doesn't make much of a difference. > > I thought this was history. With your config, I was finally able to > reproduce the anomaly (only with your proggy though), and Ingo's patch > does indeed fix it here. > > Freshly reproduced anomaly and

Re: CFS review

2007-08-21 Thread Mike Galbraith
On Tue, 2007-08-21 at 00:19 +0200, Roman Zippel wrote: > Hi, > > On Sat, 11 Aug 2007, Ingo Molnar wrote: > > > the only relevant thing that comes to mind at the moment is that last > > week Peter noticed a buggy aspect of sleeper bonuses (in that we do not > > rate-limit their output, hence we

Re: CFS review

2007-08-20 Thread Roman Zippel
Hi, On Sat, 11 Aug 2007, Ingo Molnar wrote: > the only relevant thing that comes to mind at the moment is that last > week Peter noticed a buggy aspect of sleeper bonuses (in that we do not > rate-limit their output, hence we 'waste' them instead of redistributing > them), and i've got the sma

Re: CFS review

2007-08-12 Thread Al Boldi
Ingo Molnar wrote: > * Al Boldi <[EMAIL PROTECTED]> wrote: > > The thing is, this unpredictability seems to exist even at nice level > > 0, but the smaller granularity covers it all up. It occasionally > > exhibits itself as hick-ups during transient heavy workload flux. But > > it's not easily r

Re: CFS review

2007-08-12 Thread Ingo Molnar
* Al Boldi <[EMAIL PROTECTED]> wrote: > > so could you please re-check chew jitter behavior with the latest > > kernel? (i've attached the standalone patch below, it will apply > > cleanly to rc2 too.) > > That fixes it, but by reducing granularity ctx is up 4-fold. ok, great! (the context-sw

Re: CFS review

2007-08-12 Thread Al Boldi
Ingo Molnar wrote: > * Al Boldi <[EMAIL PROTECTED]> wrote: > > That's because granularity increases when decreasing nice, and results > > in larger timeslices, which affects smoothness negatively. chew.c > > easily shows this problem with 2 background cpu-hogs at the same > > nice-level. > > > > p

Re: CFS review

2007-08-11 Thread Ingo Molnar
* Willy Tarreau <[EMAIL PROTECTED]> wrote: > > 1. Two simple busy loops, one of them is reniced to 15, according to > > my calculations the reniced task should get about 3.4% > > (1/(1.25^15+1)), but I get this: > > > > PID USER PR NI VIRT RES SHR S %CPU %MEMTIME+ COMMAND > >

Re: CFS review

2007-08-11 Thread Ingo Molnar
* Al Boldi <[EMAIL PROTECTED]> wrote: > That's because granularity increases when decreasing nice, and results > in larger timeslices, which affects smoothness negatively. chew.c > easily shows this problem with 2 background cpu-hogs at the same > nice-level. > > pid 908, prio 0, out for

Re: CFS review

2007-08-11 Thread Al Boldi
Roman Zippel wrote: > On Fri, 10 Aug 2007, Ingo Molnar wrote: > > achieve that. It probably wont make a real difference, but it's really > > easy for you to send and it's still very useful when one tries to > > eliminate possibilities and when one wants to concentrate on the > > remaining possibili

Re: CFS review

2007-08-10 Thread Willy Tarreau
On Sat, Aug 11, 2007 at 12:50:08AM +0200, Roman Zippel wrote: > Hi, > > On Fri, 10 Aug 2007, Ingo Molnar wrote: > > > achieve that. It probably wont make a real difference, but it's really > > easy for you to send and it's still very useful when one tries to > > eliminate possibilities and when

Re: CFS review

2007-08-10 Thread Willy Tarreau
On Fri, Aug 10, 2007 at 11:15:55PM +0200, Roman Zippel wrote: > Hi, > > On Fri, 10 Aug 2007, Willy Tarreau wrote: > > > fortunately all bug reporters are not like you. It's amazing how long > > you can resist sending a simple bug report to a developer! > > I'm more amazed how long Ingo can resis

Re: CFS review

2007-08-10 Thread Ingo Molnar
* Ingo Molnar <[EMAIL PROTECTED]> wrote: > * Roman Zippel <[EMAIL PROTECTED]> wrote: > > > Well, I've sent him the stuff now... > > received it - thanks a lot, looking at it! everything looks good in your debug output and the TSC dump data, except for the wait_runtime values, they are quite ou

Re: CFS review

2007-08-10 Thread Roman Zippel
Hi, On Fri, 10 Aug 2007, Ingo Molnar wrote: > achieve that. It probably wont make a real difference, but it's really > easy for you to send and it's still very useful when one tries to > eliminate possibilities and when one wants to concentrate on the > remaining possibilities alone. The thin

Re: CFS review

2007-08-10 Thread Ingo Molnar
* Roman Zippel <[EMAIL PROTECTED]> wrote: > Well, I've sent him the stuff now... received it - thanks a lot, looking at it! > It's not like I haven't given him anything, he already has the test > programs, he already knows the system configuration. one more small thing: could you please send y

Re: CFS review

2007-08-10 Thread Roman Zippel
Hi, On Fri, 10 Aug 2007, Willy Tarreau wrote: > fortunately all bug reporters are not like you. It's amazing how long > you can resist sending a simple bug report to a developer! I'm more amazed how long Ingo can resist providing some explanations (not just about this problem). It's not like I

Re: CFS review

2007-08-10 Thread Willy Tarreau
On Fri, Aug 10, 2007 at 07:25:57PM +0200, Roman Zippel wrote: > Hi, > > On Fri, 10 Aug 2007, Michael Chang wrote: > > > On 8/10/07, Roman Zippel <[EMAIL PROTECTED]> wrote: > > > Is there any reason to believe my analysis is wrong? > > > > Not yet, but if you give Ingo what he wants (as opposed t

Re: CFS review

2007-08-10 Thread Ingo Molnar
* Roman Zippel <[EMAIL PROTECTED]> wrote: > > Not yet, but if you give Ingo what he wants (as opposed to what > > you're giving him) it'll be easier for him to answer what's going > > wrong, and perhaps "fix" the problem to boot. > > > > (The scripts gives info about CPU characteristics, inter

Re: CFS review

2007-08-10 Thread Roman Zippel
Hi, On Fri, 10 Aug 2007, Michael Chang wrote: > On 8/10/07, Roman Zippel <[EMAIL PROTECTED]> wrote: > > Is there any reason to believe my analysis is wrong? > > Not yet, but if you give Ingo what he wants (as opposed to what you're > giving him) it'll be easier for him to answer what's going wro

Re: CFS review

2007-08-10 Thread Roman Zippel
Hi, On Fri, 10 Aug 2007, Mike Galbraith wrote: > I guess I'm going to have to give up on trying to reproduce this... my > 3GHz P4 is just not getting there from here. Last attempt, compiled UP, > HZ=1000 dynticks, full preempt and highres timers fwiw. > > 6392 root 20 0 1696 332 248 R

Re: CFS review

2007-08-10 Thread Michael Chang
On 8/10/07, Roman Zippel <[EMAIL PROTECTED]> wrote: > Is there any reason to believe my analysis is wrong? Not yet, but if you give Ingo what he wants (as opposed to what you're giving him) it'll be easier for him to answer what's going wrong, and perhaps "fix" the problem to boot. (The scripts g

Re: CFS review

2007-08-10 Thread Mike Galbraith
I guess I'm going to have to give up on trying to reproduce this... my 3GHz P4 is just not getting there from here. Last attempt, compiled UP, HZ=1000 dynticks, full preempt and highres timers fwiw. 6392 root 20 0 1696 332 248 R 25.5 0.0 3:00.14 0 lt 6393 root 20 0 1696 332

Re: CFS review

2007-08-10 Thread Ingo Molnar
* Roman Zippel <[EMAIL PROTECTED]> wrote: > > Also, could you please send me > > the cfs-debug-info.sh: > > > > http://people.redhat.com/mingo/cfs-scheduler/tools/cfs-debug-info.sh > > > > captured _while_ the above workload is running. This is the third time > > i've asked for that :-) >

Re: CFS review

2007-08-10 Thread Roman Zippel
Hi, On Fri, 10 Aug 2007, Ingo Molnar wrote: > > I disabled the jiffies logic and the result is still the same, so this > > problem isn't related to resolution at all. > > how did you disable the jiffies logic? I commented it out. > Also, could you please send me > the cfs-debug-info.sh: > >

Re: CFS review

2007-08-10 Thread Mike Galbraith
On Fri, 2007-08-10 at 01:14 +0200, Roman Zippel wrote: > Hi, Greetings, > On Wed, 1 Aug 2007, Ingo Molnar wrote: > > > just to make sure, how does 'top' output of the l + "lt 3" testcase look > > like now on your laptop? Yesterday it was this: > > > > 4544 roman 20 0 1796 520 432 S 3

Re: CFS review

2007-08-09 Thread Ingo Molnar
* Roman Zippel <[EMAIL PROTECTED]> wrote: > > 4544 roman 20 0 1796 520 432 S 32.1 0.4 0:21.08 lt > > 4545 roman 20 0 1796 344 256 R 32.1 0.3 0:21.07 lt > > 4546 roman 20 0 1796 344 256 R 31.7 0.3 0:21.07 lt > > 4547 roman 20 0 1532 272 216 R 3.3

Re: CFS review

2007-08-09 Thread Roman Zippel
Hi, On Wed, 1 Aug 2007, Ingo Molnar wrote: > just to make sure, how does 'top' output of the l + "lt 3" testcase look > like now on your laptop? Yesterday it was this: > > 4544 roman 20 0 1796 520 432 S 32.1 0.4 0:21.08 lt > 4545 roman 20 0 1796 344 256 R 32.1 0.3 0:21

Re: CFS review

2007-08-03 Thread Ingo Molnar
* Matt Mackall <[EMAIL PROTECTED]> wrote: > > question is if it's significantly worse than before. With a 100 or > > 1000Hz timer, you can't expect perfect fairness just due to the > > extremely rough measurement of time spent... > > Indeed. I'm just pointing out that not having TSC, fast HZ,

Re: CFS review

2007-08-03 Thread Andi Kleen
Matt Mackall <[EMAIL PROTECTED]> writes: > > Indeed. I'm just pointing out that not having TSC, fast HZ, no-HZ > mode, or high-res timers should not be treated as an unusual > circumstance. That's a PC-centric view. The question is if it would be that hard to add TSC equivalent sched_clock() supp

Re: CFS review

2007-08-02 Thread Willy Tarreau
On Thu, Aug 02, 2007 at 09:31:19PM -0700, Arjan van de Ven wrote: > On Fri, 2007-08-03 at 06:18 +0200, Willy Tarreau wrote: > > On Thu, Aug 02, 2007 at 08:57:47PM -0700, Arjan van de Ven wrote: > > > On Thu, 2007-08-02 at 22:04 -0500, Matt Mackall wrote: > > > > On Wed, Aug 01, 2007 at 01:22:29PM +

Re: CFS review

2007-08-02 Thread Matt Mackall
On Thu, Aug 02, 2007 at 08:57:47PM -0700, Arjan van de Ven wrote: > On Thu, 2007-08-02 at 22:04 -0500, Matt Mackall wrote: > > On Wed, Aug 01, 2007 at 01:22:29PM +0200, Ingo Molnar wrote: > > > > > > * Roman Zippel <[EMAIL PROTECTED]> wrote: > > > > > > > [...] e.g. in this example there are thre

Re: CFS review

2007-08-02 Thread Arjan van de Ven
On Fri, 2007-08-03 at 06:18 +0200, Willy Tarreau wrote: > On Thu, Aug 02, 2007 at 08:57:47PM -0700, Arjan van de Ven wrote: > > On Thu, 2007-08-02 at 22:04 -0500, Matt Mackall wrote: > > > On Wed, Aug 01, 2007 at 01:22:29PM +0200, Ingo Molnar wrote: > > > > > > > > * Roman Zippel <[EMAIL PROTECTED

Re: CFS review

2007-08-02 Thread Willy Tarreau
On Thu, Aug 02, 2007 at 08:57:47PM -0700, Arjan van de Ven wrote: > On Thu, 2007-08-02 at 22:04 -0500, Matt Mackall wrote: > > On Wed, Aug 01, 2007 at 01:22:29PM +0200, Ingo Molnar wrote: > > > > > > * Roman Zippel <[EMAIL PROTECTED]> wrote: > > > > > > > [...] e.g. in this example there are thre

Re: CFS review

2007-08-02 Thread Arjan van de Ven
On Thu, 2007-08-02 at 22:04 -0500, Matt Mackall wrote: > On Wed, Aug 01, 2007 at 01:22:29PM +0200, Ingo Molnar wrote: > > > > * Roman Zippel <[EMAIL PROTECTED]> wrote: > > > > > [...] e.g. in this example there are three tasks that run only for > > > about 1ms every 3ms, but they get far more ti

Re: CFS review

2007-08-02 Thread Matt Mackall
On Wed, Aug 01, 2007 at 01:22:29PM +0200, Ingo Molnar wrote: > > * Roman Zippel <[EMAIL PROTECTED]> wrote: > > > [...] e.g. in this example there are three tasks that run only for > > about 1ms every 3ms, but they get far more time than should have > > gotten fairly: > > > > 4544 roman 20

Re: CFS review

2007-08-02 Thread Roman Zippel
Hi, On Wed, 1 Aug 2007, Linus Torvalds wrote: > So I think it would be entirely appropriate to > > - do something that *approximates* microseconds. > >Using microseconds instead of nanoseconds would likely allow us to do >32-bit arithmetic in more areas, without any real overflow. Th

Re: CFS review

2007-08-02 Thread Roman Zippel
Hi, On Thu, 2 Aug 2007, Ingo Molnar wrote: > Most importantly, CFS _already_ includes a number of measures that act > against too frequent math. So even though you can see 64-bit math code > in it, it's only rarely called if your clock has a low resolution - and > that happens all automaticall

Re: CFS review

2007-08-02 Thread Daniel Phillips
Hi Linus, On Wednesday 01 August 2007 19:17, Linus Torvalds wrote: >And the "approximates" thing would be about the fact that we don't >actually care about "absolute" microseconds as much as something > that is in the "roughly a microsecond" area. So if we say "it doesn't > have to be micr

Re: CFS review

2007-08-02 Thread Roman Zippel
Hi, On Wed, 1 Aug 2007, Peter Zijlstra wrote: > Took me most of today trying to figure out WTH you did in fs2.c, more > math and fundamental explanations would have been good. So please bear > with me as I try to recap this thing. (No, your code was very much _not_ > obvious, a few comments and b

Re: CFS review

2007-08-02 Thread Ingo Molnar
* Linus Torvalds <[EMAIL PROTECTED]> wrote: > It would be better, I suspect, to make the scheduler clock totally > distinct from the other clock sources (many architectures have per-cpu > cycle counters), and *not* try to even necessarily force it to be a > "time-based" one. yeah. Note that i

Re: CFS review

2007-08-02 Thread Ingo Molnar
* Roman Zippel <[EMAIL PROTECTED]> wrote: > [...] With the increased text comes increased runtime memory usage, > e.g. task_struct increased so that only 5 of them instead 6 fit now > into 8KB. yeah, thanks for the reminder, this is on my todo list. As i suspect you noticed it too, much of th

Re: CFS review

2007-08-02 Thread Willy Tarreau
On Thu, Aug 02, 2007 at 12:43:29PM +0200, Andi Kleen wrote: > > However, I understand why Ingo chose to use 64 bits. It has the advantage > > that the numbers never wrap within 584 years. I'm well aware that it's > > very difficult to keep tasks ordered according to a key which can wrap. > > If you

Re: CFS review

2007-08-02 Thread Andi Kleen
Willy Tarreau <[EMAIL PROTECTED]> writes: >(I remember it could not play with registers renaming, etc...). This has changed in recent gccs. It doesn't force register pairs anymore. Given the code is still not that good, but some of the worst sins are gone > However, I understand why Ingo chose to

Re: CFS review

2007-08-01 Thread Willy Tarreau
On Wed, Aug 01, 2007 at 07:17:51PM -0700, Linus Torvalds wrote: > > > On Wed, 1 Aug 2007, Roman Zippel wrote: > > > > I'm not so sure about that. sched_clock() has to be fast, so many archs > > may want to continue to use jiffies. As soon as one does that one can also > > save a lot of computa

Re: CFS review

2007-08-01 Thread Linus Torvalds
On Wed, 1 Aug 2007, Roman Zippel wrote: > > I'm not so sure about that. sched_clock() has to be fast, so many archs > may want to continue to use jiffies. As soon as one does that one can also > save a lot of computational overhead by using 32bit instead of 64bit. > The question is then how ea

Re: CFS review

2007-08-01 Thread Ingo Molnar
* Roman Zippel <[EMAIL PROTECTED]> wrote: > Hi, > > On Wed, 1 Aug 2007, Ingo Molnar wrote: > > > Andi's theory cannot be true either, Roman's debug info also shows this > > /proc//sched data: > > > > clock-delta : 95 > > > > that means that sched_clock() is in

Re: CFS review

2007-08-01 Thread Roman Zippel
Hi, On Wed, 1 Aug 2007, Ingo Molnar wrote: > > [...] I didn't say 'sleeper starvation' or 'rounding error', these are > > your words and it's your perception of what I said. > > Oh dear :-) It was indeed my perception that yesterday you said: *sigh* and here you go off again nitpicking on a mi

Re: CFS review

2007-08-01 Thread Roman Zippel
Hi, On Wed, 1 Aug 2007, Ingo Molnar wrote: > Andi's theory cannot be true either, Roman's debug info also shows this > /proc//sched data: > > clock-delta : 95 > > that means that sched_clock() is in high-res mode, the TSC is alive and > kicking and a sched_cloc

Re: CFS review

2007-08-01 Thread Ingo Molnar
* Linus Torvalds <[EMAIL PROTECTED]> wrote: > On Wed, 1 Aug 2007, Andi Kleen wrote: > > > Ingo Molnar <[EMAIL PROTECTED]> writes: > > > > > thanks. Just to make sure, while you said that your TSC was off on that > > > laptop, the bootup log of yours suggests a working TSC: > > > > > > Time:

Re: CFS review

2007-08-01 Thread Andi Kleen
> I assume that what Roman hit was that he had explicitly disabled the TSC > because of TSC instability with the "notsc" kernel command line. Which > disables it *entirely*. It might just have been cpufreq. That nearly hits everybody with cpufreq unless you have a pstate invariant TSC; and that'

Re: CFS review

2007-08-01 Thread Ingo Molnar
* Roman Zippel <[EMAIL PROTECTED]> wrote: > > i tried your fl.c and if sched_clock() is high-resolution it's scheduled > > _perfectly_ by CFS: > > > >PID USER PR NI VIRT RES SHR S %CPU %MEMTIME+ COMMAND > > 5906 mingo 20 0 1576 244 196 R 71.2 0.0 0:30.11 l > >

Re: CFS review

2007-08-01 Thread Linus Torvalds
On Wed, 1 Aug 2007, Andi Kleen wrote: > Ingo Molnar <[EMAIL PROTECTED]> writes: > > > thanks. Just to make sure, while you said that your TSC was off on that > > laptop, the bootup log of yours suggests a working TSC: > > > > Time: tsc clocksource has been installed. > > Standard kernels o

Re: CFS review

2007-08-01 Thread Andi Kleen
Ingo Molnar <[EMAIL PROTECTED]> writes: > thanks. Just to make sure, while you said that your TSC was off on that > laptop, the bootup log of yours suggests a working TSC: > > Time: tsc clocksource has been installed. Standard kernels often disable the TSC later after running a bit with it (

Re: CFS review

2007-08-01 Thread Andi Kleen
On Wed, Aug 01, 2007 at 04:36:24PM +0200, Ingo Molnar wrote: > > * Roman Zippel <[EMAIL PROTECTED]> wrote: > > > > jiffies based sched_clock should be soon very rare. It's probably > > > not worth optimizing for it. > > > > I'm not so sure about that. sched_clock() has to be fast, so many > >
