Ingo Molnar wrote:
> * Linus Torvalds <[EMAIL PROTECTED]> wrote:
> > On Tue, 28 Aug 2007, Al Boldi wrote:
> > > I like your analysis, but how do you explain that these stalls
> > > vanish when __update_curr is disabled?
> >
> > It's entirely possible that what happens is that the X scheduling is
> > just a slightly unstable system - which effectively would turn a small
> > scheduling difference into a *huge* visible difference.
>
> i think it's because disabling __update_curr() in essence removes the
> ability of scheduler to preempt tasks - that hack in essence results in
> a non-scheduler. Hence the gears + X pair of tasks becomes a synchronous
> pair of tasks in essence - and thus gears cannot "overload" X.

I have narrowed it down a bit to add_wait_runtime.

Patch 2.6.22.5-v20.4 by commenting out the add_wait_runtime call at
line 349, like this:

     * the two values are equal)
     * [Note: delta_mine - delta_exec is negative]:
     */
    //	add_wait_runtime(cfs_rq, curr, delta_mine - delta_exec);
    }

    static void update_curr(struct cfs_rq *cfs_rq)

With add_wait_runtime disabled, the stalls are gone. The scheduler is
still usable with this change, but it does not constitute a fix.

Note that even with this hack, uneven nice-levels between X and gears
cause the stalls to return, so make sure both X and gears run at the
same nice-level when testing.

Again, the whole point of this workload is to expose scheduler glitches
regardless of whether X is broken or not, and my hunch is that this
problem looks suspiciously like an interactivity-boosting bug. What's
important to note is that adjusting the scheduler corrects the
behaviour, which suggests the problem is fixable in the scheduler.

It's probably a good idea to look further into add_wait_runtime.


Thanks!

--
Al
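
For anyone who wants to reason about what the commented-out call changes, here is a rough userspace sketch of the accounting step. It is a toy model, not the kernel code: every toy_* name is hypothetical, the clamp value is an assumption picked for illustration, and the runqueue-wide statistics the real function maintains are ignored. It only shows that the running task is debited by the difference between the time it was entitled to (delta_mine) and the time it actually ran (delta_exec), and that the hack skips that debit.

```c
#include <assert.h>

/* Toy userspace model, NOT kernel code. All toy_* names are hypothetical. */

/* Assumed clamp on wait_runtime in nanoseconds (illustrative value only). */
#define TOY_RUNTIME_LIMIT 40000000LL

struct toy_entity {
	long long weight;       /* load weight of the task (depends on nice level) */
	long long wait_runtime; /* CPU time the task is owed; negative if it overran */
};

/* Share of delta_exec the task was entitled to, given the total runqueue weight. */
static long long toy_delta_mine(long long delta_exec, long long weight,
				long long total_weight)
{
	assert(total_weight > 0);
	return delta_exec * weight / total_weight;
}

/* Credit or debit wait_runtime, clamped to +/- TOY_RUNTIME_LIMIT. */
static void toy_add_wait_runtime(struct toy_entity *se, long long delta)
{
	long long v = se->wait_runtime + delta;

	if (v > TOY_RUNTIME_LIMIT)
		v = TOY_RUNTIME_LIMIT;
	if (v < -TOY_RUNTIME_LIMIT)
		v = -TOY_RUNTIME_LIMIT;
	se->wait_runtime = v;
}

/*
 * Model of the last step of __update_curr for the running task.
 * Passing account == 0 mimics the hack of commenting out the call.
 */
static void toy_update_curr(struct toy_entity *curr, long long total_weight,
			    long long delta_exec, int account)
{
	long long delta_mine = toy_delta_mine(delta_exec, curr->weight,
					      total_weight);

	/* delta_mine - delta_exec is zero or negative when other tasks are runnable. */
	if (account)
		toy_add_wait_runtime(curr, delta_mine - delta_exec);
}
```

With two runnable tasks of equal weight (say 1024 each, 2048 in total), a task that ran for 1000000 ns was entitled to 500000 ns, so toy_update_curr() leaves it with a wait_runtime of -500000. With account set to 0 the value stays at 0, which is the situation the hack creates: the running task never falls behind. Giving the two tasks different weights changes the entitlement, which is one place where uneven nice-levels enter the picture.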