On Tue, Nov 27, 2007 at 10:21:12AM +0100, Dmitry Adamushko wrote:
> On 26/11/2007, Micah Dowty <[EMAIL PROTECTED]> wrote:
> >
> > The application doesn't really depend on the load-balancer's decisions
> > per se, it just happens that this behaviour I'm seeing [...]
Dmitry,
Thank you for the detailed explanation of the scheduler behaviour I've
been seeing.
On Thu, Nov 22, 2007 at 01:53:02PM +0100, Dmitry Adamushko wrote:
> > - Is there a good way to detect, without any kernel debug flags
> >   set, whether the current machine has any scheduling domains [...]
> >
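One way to check that without scheduler debug options, assuming CONFIG_SCHEDSTATS is enabled on the running kernel, is to look for "domain" lines in /proc/schedstat; a minimal sketch:

/* Sketch: report whether /proc/schedstat lists any sched domains.
 * Assumes CONFIG_SCHEDSTATS=y; without it the file is not available. */
#include <stdio.h>
#include <string.h>

int main(void)
{
    FILE *f = fopen("/proc/schedstat", "r");
    char line[512];
    int domains = 0;

    if (!f) {
        perror("/proc/schedstat");
        return 2;
    }
    while (fgets(line, sizeof line, f))
        if (strncmp(line, "domain", 6) == 0)
            domains++;
    fclose(f);

    printf("%d sched domain line(s) found\n", domains);
    return domains ? 0 : 1;
}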
On Tue, Nov 20, 2007 at 10:47:52PM +0100, Dmitry Adamushko wrote:
> btw., what's your system? If I recall right, SD_BALANCE_NEWIDLE is on
> by default for all configs, except for NUMA nodes.
It's a dual AMD64 Opteron.
So, I recompiled my 2.6.23.1 kernel without NUMA support, and with
your patch [...]
On Tue, Nov 20, 2007 at 06:57:55AM +0100, Ingo Molnar wrote:
>
> * Micah Dowty <[EMAIL PROTECTED]> wrote:
>
> > > this one is being triggered whenever a cpu becomes idle (schedule()
> > > --> idle_balance() --> load_balance_newidle()).
> > >
>
On Mon, Nov 19, 2007 at 11:22:06PM +0100, Dmitry Adamushko wrote:
> You seem to have a configuration with domains which don't have
> SD_BALANCE_NEWIDLE on (CONFIG_NUMA?) as there are no events (all
> zeros above) for CPU_NEWLY_IDLE.
>
> this one is being triggered whenever a cpu becomes idle (schedule()
> --> idle_balance() --> load_balance_newidle()).
On Sat, Nov 17, 2007 at 08:10:35PM +0100, Dmitry Adamushko wrote:
> Micah,
>
> ok, would it be possible to get "cat /proc/schedstat" output at the
> moment when you observe the 'problem'? So we could try to analyze
> behavior of the load balancer (yeah, we should have probably started
> with this [...]
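Grabbing that snapshot at the moment the imbalance shows up is easy to script; a trivial sketch (the timestamp format is arbitrary):

/* Sketch: dump /proc/schedstat to stdout with a timestamp, so it can
 * be captured the moment the imbalance is observed. */
#include <stdio.h>
#include <time.h>

int main(void)
{
    FILE *f = fopen("/proc/schedstat", "r");
    char buf[4096];
    size_t n;

    if (!f) {
        perror("/proc/schedstat");
        return 1;
    }
    printf("# schedstat snapshot at %ld\n", (long)time(NULL));
    while ((n = fread(buf, 1, sizeof buf, f)) > 0)
        fwrite(buf, 1, n, stdout);
    fclose(f);
    return 0;
}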
On Sat, Nov 17, 2007 at 12:26:41AM +0100, Dmitry Adamushko wrote:
> Let's say we change a pattern for the niced task: e.g. run for 100 ms.
> and then sleep for 300 ms. (that's ~25% of cpu load) in the loop. Any
> behavioral changes?
For consistency, I tested this using /dev/rtc. I set the rtc frequency [...]
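A load pattern along the lines Dmitry suggested can be paced off /dev/rtc roughly as in the sketch below; the 64 Hz rate and the tick count are illustrative, not necessarily the parameters used in the actual test:

/*
 * Sketch only: a ~25% duty-cycle load (busy ~100 ms, then idle ~300 ms),
 * with the idle phase paced by the RTC periodic interrupt so the timing
 * is consistent between runs.
 */
#include <fcntl.h>
#include <linux/rtc.h>
#include <stdio.h>
#include <sys/ioctl.h>
#include <sys/time.h>
#include <unistd.h>

static void busy_ms(long ms)
{
    struct timeval start, now;

    gettimeofday(&start, NULL);
    do {
        gettimeofday(&now, NULL);
    } while ((now.tv_sec - start.tv_sec) * 1000 +
             (now.tv_usec - start.tv_usec) / 1000 < ms);
}

int main(void)
{
    unsigned long freq = 64, data;
    int fd = open("/dev/rtc", O_RDONLY);
    int i;

    if (fd < 0 || ioctl(fd, RTC_IRQP_SET, freq) < 0 ||
        ioctl(fd, RTC_PIE_ON, 0) < 0) {
        perror("/dev/rtc");
        return 1;
    }
    for (;;) {
        busy_ms(100);                       /* ~100 ms spinning on the CPU */
        for (i = 0; i < 19; i++)            /* ~300 ms asleep: 19 ticks    */
            read(fd, &data, sizeof data);   /* blocks until next RTC tick  */
    }
}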
On Fri, Nov 16, 2007 at 11:48:50AM +0100, Dmitry Adamushko wrote:
> could you try to change either:
>
> cat /proc/sys/kernel/sched_stat_granularity
>
> and set it to a value equal to one tick on your system
This didn't seem to have any effect.
> or just remove bit #3 (which is responsible for 8 == [...]
On Sat, Nov 17, 2007 at 05:43:33AM +1030, David Newall wrote:
> There are a couple of points I would make about your python test harness.
> Your program compares real+system jiffies for both cpus; an ideal result
> would be 1.00. The measurement is taken over a relatively short period [...]
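For reference, the figure being discussed is essentially the ratio of busy (user+nice+system) jiffies between the two CPUs over the sampling window. A stand-alone C version of the same measurement might look like this; the two-CPU assumption and the 5-second interval are arbitrary:

/*
 * Sketch: sample per-CPU busy jiffies (user+nice+system) from
 * /proc/stat twice and print the cpu0/cpu1 ratio, roughly the figure
 * the Python harness reports (1.00 would be ideal).
 */
#include <stdio.h>
#include <unistd.h>

static void sample(unsigned long long busy[2])
{
    FILE *f = fopen("/proc/stat", "r");
    char line[256];
    unsigned long long user, nice, sys;
    int cpu;

    while (f && fgets(line, sizeof line, f)) {
        if (sscanf(line, "cpu%d %llu %llu %llu",
                   &cpu, &user, &nice, &sys) == 4 && cpu >= 0 && cpu < 2)
            busy[cpu] = user + nice + sys;
    }
    if (f)
        fclose(f);
}

int main(void)
{
    unsigned long long a[2] = {0, 0}, b[2] = {0, 0};

    sample(a);
    sleep(5);
    sample(b);
    printf("cpu0/cpu1 busy ratio: %.2f\n",
           (double)(b[0] - a[0]) /
           (double)(b[1] - a[1] ? b[1] - a[1] : 1));
    return 0;
}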
On Fri, Nov 16, 2007 at 07:07:00AM +0100, Ingo Molnar wrote:
>
> * Micah Dowty <[EMAIL PROTECTED]> wrote:
>
> > > I am a bit at a loss as to how this could relate to the patch. This
> > > looks like a load balance logic issue that causes the load [...]
> > >
On Fri, Nov 16, 2007 at 07:07:00AM +0100, Ingo Molnar wrote:
> > My best guess is that this has something to do with the timing with
> > which we sample the CPU's instantaneous load when calculating the load
> > averages.. but I still understand only the basics of the scheduler and
> > SMP balancing [...]
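That guess can be illustrated with a toy model outside the kernel: cpu_load is, roughly speaking, a decayed average fed the instantaneous runqueue load once per tick, so a task that is (nearly) always runnable at the instant of the tick keeps the average near its full weight no matter how little CPU it actually uses. The weight and decay factor below are made up for illustration:

/*
 * Toy model, not kernel code: a decayed per-tick average of the
 * sampled runqueue load.  If the high-priority task wakes just before
 * every tick and sleeps right after, every sample "sees" it, and the
 * average converges to its full weight even though it burns ~1% CPU.
 */
#include <stdio.h>

int main(void)
{
    double cpu_load = 0.0;             /* decayed tick-sampled average  */
    const double task_weight = 10000;  /* made-up load weight           */
    const double decay = 0.875;        /* made-up per-tick decay factor */
    int tick;

    for (tick = 0; tick < 1000; tick++) {
        double sampled = task_weight;  /* the task is on the runqueue   */
                                       /* at every sampling point       */
        cpu_load = cpu_load * decay + sampled * (1.0 - decay);
    }
    printf("tick-sampled average load:     %.0f\n", cpu_load);
    printf("time-weighted load (~1%% busy): %.0f\n", task_weight * 0.01);
    return 0;
}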
On Thu, Nov 15, 2007 at 06:31:49PM -0800, Christoph Lameter wrote:
> On Thu, 15 Nov 2007, Micah Dowty wrote:
>
> > On all kernels I've tested from after your patch was committed, I can
> > reproduce a problem where a single high-priority thread which wakes up
> > very frequently [...]
On Thu, Nov 15, 2007 at 01:28:55PM -0800, Christoph Lameter wrote:
> On Thu, 15 Nov 2007, Micah Dowty wrote:
>
> > Yes, the Python test harness crashes, not the kernel. It's just
> > because on a kernel which exhibits this SMP balancer bug, within a
> > couple of [...]
On Thu, Nov 15, 2007 at 12:07:47PM -0800, Christoph Lameter wrote:
> On Thu, 15 Nov 2007, Micah Dowty wrote:
>
> > For reference, the exact test I used with git-bisect is attached. The
> > C program (priosched) starts two busy-looping threads and a
> > high-priority [...]
sudo ./priosched
*
* Now observe the load on both CPUs. In the "good" state, both CPUs
* will be busy. In the "bad" state, both of the busyThreads will be
* stuck on the same CPU and the other CPU will be idle.
*
* If you have a kernel with scheduler debugging compiled in [...]
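The full test program is not reproduced in this digest. A minimal reconstruction along the lines described above (two plain busy loops plus a high-priority thread that wakes very frequently and does almost nothing) might look like the sketch below; the SCHED_FIFO priority of 50, the 100 µs wake interval and the hard-coded thread count are assumptions rather than the original's values, and the real-time priority is why the test has to run under sudo:

/*
 * Minimal reconstruction, not the original priosched.c: two ordinary
 * busy-looping threads plus one SCHED_FIFO thread that wakes very
 * frequently and does almost no work.  Priority, wake interval and
 * thread count are illustrative.  Build with: gcc -O2 -pthread
 */
#include <pthread.h>
#include <sched.h>
#include <stdio.h>
#include <unistd.h>

static void *busy_thread(void *arg)
{
    volatile unsigned long counter = 0;

    (void)arg;
    for (;;)
        counter++;                 /* pure CPU burn */
    return NULL;
}

static void *high_prio_thread(void *arg)
{
    struct sched_param sp = { .sched_priority = 50 };

    (void)arg;
    if (pthread_setschedparam(pthread_self(), SCHED_FIFO, &sp) != 0)
        fprintf(stderr, "pthread_setschedparam failed (not root?)\n");

    for (;;)
        usleep(100);               /* wake very often, do almost nothing */
    return NULL;
}

int main(void)
{
    pthread_t t;
    int i;

    for (i = 0; i < 2; i++)
        pthread_create(&t, NULL, busy_thread, NULL);
    pthread_create(&t, NULL, high_prio_thread, NULL);

    pause();                       /* watch per-CPU load in top ('1') */
    return 0;
}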
On Fri, Nov 09, 2007 at 04:11:03PM -0800, Micah Dowty wrote:
> It's also possible this problem doesn't occur on 2.6.17. I have only
> tested this example on 2.6.23.1 and 2.6.20 so far.
I tested a handful of kernel versions, and it looks like this is
indeed the case. As far as [...]
On Sat, Nov 10, 2007 at 12:56:07AM +0100, Cyrus Massoumi wrote:
> I tried your program on my machine (C2D, 2.6.17, O(1) scheduler).
>
> Both CPUs are 100% busy all the time. Each busy-looping thread is running
> on its own CPU. I've been watching top output for 10 minutes, the spreading
> is stable [...]
Is this a behaviour any of the scheduler developers are aware of? I
would be very grateful if anyone could shed some light on the root
cause behind the inflated cpu_load average. If this turns out to be a
real bug, I would be happy to work on a patch.
Thanks in advance,
Micah Dowty
/*
* This is a demonstration [...]