From: Peter Zijlstra <[EMAIL PROTECTED]>
Date: Fri, 05 Oct 2007 22:32:00 +0200
> Focus on the slab allocator usage, instrument it, record a trace,
> generate a statistical model that matches, and write a small
> program/kernel module that has the same allocation pattern. Then verify
> this statis
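The approach quoted above (record a trace, fit a statistical model, replay a matching pattern) could be sketched roughly as follows. This is only an illustration under stated assumptions, not anything posted in the thread: the trace format of `(op, size)` tuples and the names `build_model` and `synthesize` are hypothetical, and the model is deliberately crude (a size histogram plus an alloc/free ratio). A real reproduction would replay the generated sequence from a C program or kernel module against the slab allocator.

```python
import random
from collections import Counter


def build_model(trace):
    """Summarize a recorded trace of (op, size) tuples, op in {"alloc", "free"}.

    Hypothetical trace format; the model is a size histogram plus the
    fraction of operations that are allocations.
    """
    sizes = Counter(size for op, size in trace if op == "alloc")
    allocs = sum(sizes.values())
    if allocs == 0:
        raise ValueError("trace contains no allocations")
    frees = sum(1 for op, _ in trace if op == "free")
    return {"sizes": sizes, "p_alloc": allocs / (allocs + frees)}


def synthesize(model, n_ops, seed=0):
    """Generate n_ops synthetic operations that follow the model.

    Frees always target a randomly chosen live object, so the sequence
    never frees more objects than it has allocated.
    """
    rng = random.Random(seed)
    population = list(model["sizes"].keys())
    weights = list(model["sizes"].values())
    live = []  # sizes of currently allocated objects
    ops = []
    for _ in range(n_ops):
        if not live or rng.random() < model["p_alloc"]:
            size = rng.choices(population, weights=weights)[0]
            live.append(size)
            ops.append(("alloc", size))
        else:
            ops.append(("free", live.pop(rng.randrange(len(live)))))
    return ops
```

Note that this sketch ignores which CPU performs each operation, so it would have to be extended with a per-operation CPU field before it could model a cross-CPU alloc/free pattern.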
On Thu, 2007-10-04 at 17:02 -0400, Chuck Ebbert wrote:
> On 10/04/2007 04:55 PM, David Miller wrote:
> >
> > Anything, I do mean anything, can be simulated using small test
> > programs.
>
> How do you simulate reading 100TB of data spread across 3000 disks,
> selecting 10% of it using some crit
Patch 2/2
SLUB: Allow foreign objects on the per cpu object lists.
In order to free objects we need to touch the page struct of the page that the
object belongs to. If this occurs too frequently then we could generate a
bouncing cacheline.
We do not want that to occur too frequently. We can av
On Fri, 5 Oct 2007, Jens Axboe wrote:
> It might not, it might. The point is trying to isolate the problem and
> making a simple test case that could be used to reproduce it, so that
> Christoph (or someone else) can easily fix it.
In case there is someone who wants to hack on it: Here is what I
On Fri, 5 Oct 2007, Matthew Wilcox wrote:
> I vaguely remembered something called orasim, so I went looking for it.
> I found http://oss.oracle.com/~wcoekaer/orasim/ which is dated from
> 2004, and I found http://oss.oracle.com/projects/orasimjobfiles/ which
> seems to be a stillborn project. Is
On Fri, Oct 05 2007, Andi Kleen wrote:
> Jens Axboe <[EMAIL PROTECTED]> writes:
> >
> > Writing a small test module to exercise slub/slab in various ways
> > (allocating from all cpus freeing from one, as described) should not be
> > too hard. Perhaps that would be enough to find this performance
On Fri, Oct 05 2007, Matthew Wilcox wrote:
> On Fri, Oct 05, 2007 at 08:48:53AM +0200, Jens Axboe wrote:
> > I'd like to second David's emails here, this is a serious problem. Having
> > a reproducible test case lowers the barrier for getting the problem
> > fixed by orders of magnitude. It's the di
On Fri, Oct 05, 2007 at 08:48:53AM +0200, Jens Axboe wrote:
> I'd like to second David's emails here, this is a serious problem. Having
> a reproducible test case lowers the barrier for getting the problem
> fixed by orders of magnitude. It's the difference between the problem
> getting fixed in a d
Jens Axboe <[EMAIL PROTECTED]> writes:
>
> Writing a small test module to exercise slub/slab in various ways
> (allocating from all cpus freeing from one, as described) should not be
> too hard. Perhaps that would be enough to find this performance
> discrepancy between slab and slub?
You could s
On Fri, Oct 05 2007, Pekka Enberg wrote:
> Hi,
>
> On 10/5/07, Jens Axboe <[EMAIL PROTECTED]> wrote:
> > I'd like to second David's emails here, this is a serious problem. Having
> > a reproducible test case lowers the barrier for getting the problem
> > fixed by orders of magnitude. It's the diffe
Hi,
On 10/5/07, Jens Axboe <[EMAIL PROTECTED]> wrote:
> I'd like to second David's emails here, this is a serious problem. Having
> a reproducible test case lowers the barrier for getting the problem
> fixed by orders of magnitude. It's the difference between the problem
> getting fixed in a day or
On Fri, Oct 05 2007, David Chinner wrote:
> On Thu, Oct 04, 2007 at 03:07:18PM -0700, David Miller wrote:
> > From: Chuck Ebbert <[EMAIL PROTECTED]>
> > Date: Thu, 04 Oct 2007 17:47:48 -0400
> >
> > > On 10/04/2007 05:11 PM, David Miller wrote:
> > > > From: Chuck Ebbert <[EMAIL PROTECTED]> Date:
> On 10/04/2007 07:39 PM, David Schwartz wrote:
> > But this is just a preposterous position to put him in. If there's no
> > reproducible test case, then why should he care that one program he can't
> > even see works badly? If you care, you fix it.
> People have been trying for years to m
On Thu, 4 Oct 2007 19:43:58 -0700 (PDT)
Christoph Lameter <[EMAIL PROTECTED]> wrote:
> So there could still be page struct contention left if multiple
> processors frequently and simultaneously free to the same slab and
> that slab is not the per cpu slab of a cpu. That could be addressed
> by opt
I just spent some time looking at the functions that you see high in the
list. The trouble is that I have to speculate and that I have nothing to
verify my thoughts. If you could give me the hitlist for each of the
3 runs then this would help to check my thinking. I could be totally off
here.
On 10/04/2007 07:39 PM, David Schwartz wrote:
> But this is just a preposterous position to put him in. If there's no
> reproducible test case, then why should he care that one program he can't
> even see works badly? If you care, you fix it.
>
People have been trying for years to make reproduci
David Miller wrote:
> Using an unpublishable benchmark, whose results even cannot be
> published, really stretches the limits of "reasonable" don't you
> think?
>
> This "SLUB isn't ready yet" bullshit is just a shamans dance which
> distracts attention away from the real problem, which is that a
On Thu, Oct 04, 2007 at 03:07:18PM -0700, David Miller wrote:
> From: Chuck Ebbert <[EMAIL PROTECTED]>
> Date: Thu, 04 Oct 2007 17:47:48 -0400
>
> > On 10/04/2007 05:11 PM, David Miller wrote:
> > > From: Chuck Ebbert <[EMAIL PROTECTED]>
> > > Date: Thu, 04 Oct 2007 17:02:17 -0400
> > >
> > >> Ho
From: Chuck Ebbert <[EMAIL PROTECTED]>
Date: Thu, 04 Oct 2007 17:47:48 -0400
> On 10/04/2007 05:11 PM, David Miller wrote:
> > From: Chuck Ebbert <[EMAIL PROTECTED]>
> > Date: Thu, 04 Oct 2007 17:02:17 -0400
> >
> >> How do you simulate reading 100TB of data spread across 3000 disks,
> >> selecti
On 10/04/2007 05:11 PM, David Miller wrote:
> From: Chuck Ebbert <[EMAIL PROTECTED]>
> Date: Thu, 04 Oct 2007 17:02:17 -0400
>
>> How do you simulate reading 100TB of data spread across 3000 disks,
>> selecting 10% of it using some criterion, then sorting and
>> summarizing the result?
>
> You re
On Thu, 4 Oct 2007, Matthew Wilcox wrote:
> Yet here we stand. Christoph is aggressively trying to get slab removed
> from the tree. There is a testcase which shows slub performing worse
> than slab. It's not my fault I can't publish it. And just because I
> can't publish it doesn't mean it do
From: Chuck Ebbert <[EMAIL PROTECTED]>
Date: Thu, 04 Oct 2007 17:02:17 -0400
> How do you simulate reading 100TB of data spread across 3000 disks,
> selecting 10% of it using some criterion, then sorting and
> summarizing the result?
You repeatedly read zeros from a smaller disk into the same amo
On 10/04/2007 04:55 PM, David Miller wrote:
>
> Anything, I do mean anything, can be simulated using small test
> programs.
How do you simulate reading 100TB of data spread across 3000 disks,
selecting 10% of it using some criterion, then sorting and summarizing
the result?
From: Matthew Wilcox <[EMAIL PROTECTED]>
Date: Thu, 4 Oct 2007 14:58:12 -0600
> On Thu, Oct 04, 2007 at 01:48:34PM -0700, David Miller wrote:
> > There comes a point where it is the reporter's responsibility to help
> > the developer come up with a publishable test case the developer can
> > use t
On Thu, Oct 04, 2007 at 01:55:37PM -0700, David Miller wrote:
> Anything, I do mean anything, can be simulated using small test
> programs. Pointing at a big fancy machine with lots of storage
> and disk is a passive aggressive way to avoid the real issues,
> in that nobody is putting forth the ef
On Thu, Oct 04, 2007 at 01:48:34PM -0700, David Miller wrote:
> There comes a point where it is the reporter's responsibility to help
> the developer come up with a publishable test case the developer can
> use to work on fixing the problem and help ensure it stays fixed.
That's a lot of effort.
From: [EMAIL PROTECTED] (Matthew Wilcox)
Date: Thu, 4 Oct 2007 12:28:25 -0700
> On Thu, Oct 04, 2007 at 10:49:52AM -0700, Christoph Lameter wrote:
> > Finally: Is there some way that I can reproduce the tests on my machines?
>
> As usual for these kinds of setups ... take a two-CPU machine, 64GB
From: Arjan van de Ven <[EMAIL PROTECTED]>
Date: Thu, 4 Oct 2007 10:50:46 -0700
> Ok every time something says anything not 100% positive about SLUB you
> come back with "but it's fixed in the next patch set"... *every time*.
I think this is partly Christoph subconsciously venting his
frustration
On Thu, Oct 04, 2007 at 12:05:35PM -0700, Christoph Lameter wrote:
> > > Was the page allocator pass through patchset
> > > separately applied as I requested?
> >
> > I don't believe so. Suresh?
>
> If it was a git pull then the pass through was included and never taken
> out.
It was a git pu
On Thu, 4 Oct 2007, Matthew Wilcox wrote:
> We have three runs, all with 2.6.23-rc3 plus the patches that Suresh
> applied from 20070922. The first run is with slab. The second run is
> with SLUB and the third run is SLUB plus the tuning parameters you
> recommended.
There was quite a bit of co
On Thu, Oct 04, 2007 at 10:49:52AM -0700, Christoph Lameter wrote:
> I was not aware of that. Would it be possible for you to summarize all the
> test data that you have right now about SLUB vs. SLAB with the patches
> listed? Exactly what kernel version and what version of the per cpu
> patche
On Thu, 2007-10-04 at 10:50 -0700, Arjan van de Ven wrote:
> On Thu, 4 Oct 2007 10:38:15 -0700 (PDT)
> Christoph Lameter <[EMAIL PROTECTED]> wrote:
>
>
> > Yeah the fastpath vs. slow path is not the issue as Siddha and I
> > concluded earlier. Seems that we are mainly seeing cacheline bouncing
>
On Thu, 4 Oct 2007, Arjan van de Ven wrote:
> Ok every time something says anything not 100% positive about SLUB you
> come back with "but it's fixed in the next patch set"... *every time*.
All I ask is that people test the fixes that have been out there for the
known issues. If there are remaining
On Thu, 4 Oct 2007 10:38:15 -0700 (PDT)
Christoph Lameter <[EMAIL PROTECTED]> wrote:
> Yeah the fastpath vs. slow path is not the issue as Siddha and I
> concluded earlier. Seems that we are mainly seeing cacheline bouncing
> due to two cpus accessing meta data in the same page struct. The
> patc
On Thu, 4 Oct 2007, Matthew Wilcox wrote:
> > Yeah the fastpath vs. slow path is not the issue as Siddha and I concluded
> > earlier. Seems that we are mainly seeing cacheline bouncing due to two
> > cpus accessing meta data in the same page struct. The patches in
> > MM that are scheduled to b
On Thu, Oct 04, 2007 at 10:38:15AM -0700, Christoph Lameter wrote:
> On Thu, 4 Oct 2007, Matthew Wilcox wrote:
>
> > So, on "a well-known OLTP benchmark which prohibits publishing absolute
> > numbers" and on an x86-64 system (I don't think exactly which model
> > is important), we're seeing *6.51
On Thu, 4 Oct 2007, Matthew Wilcox wrote:
> So, on "a well-known OLTP benchmark which prohibits publishing absolute
> numbers" and on an x86-64 system (I don't think exactly which model
> is important), we're seeing *6.51%* performance loss on slub vs slab.
> This is with a 2.6.23-rc3 kernel. Tun