On Sun, 2010-10-03 at 14:13 -0500, Ayman El-Khashab wrote:
> On Sat, Sep 25, 2010 at 08:11:04AM +1000, Benjamin Herrenschmidt wrote:
> > On Fri, 2010-09-24 at 08:08 -0500, Ayman El-Khashab wrote:
> > >
> > > I suppose another option is to use the kernel profiling option I
> > > always see but
On Sat, Sep 25, 2010 at 08:11:04AM +1000, Benjamin Herrenschmidt wrote:
> On Fri, 2010-09-24 at 08:08 -0500, Ayman El-Khashab wrote:
> >
> > I suppose another option is to use the kernel profiling option I
> > always see but have never used. Is that a viable option to figure out
> > what is h
On Fri, 2010-09-24 at 08:08 -0500, Ayman El-Khashab wrote:
>
> I suppose another option is to use the kernel profiling option I
> always see but have never used. Is that a viable option to figure out
> what is happening here?
With perf and stochastic sampling? If you sample fast enough...
On Fri, Sep 24, 2010 at 06:30:34AM -0400, Josh Boyer wrote:
> On Fri, Sep 24, 2010 at 02:43:52PM +1000, Benjamin Herrenschmidt wrote:
> >> The DMA is what I use in the "real world case" to get data into and out
> >> of these buffers. However, I can disable the DMA completely and do only
> >> the
On Fri, Sep 24, 2010 at 02:43:52PM +1000, Benjamin Herrenschmidt wrote:
>> The DMA is what I use in the "real world case" to get data into and out
>> of these buffers. However, I can disable the DMA completely and do only
>> the kmalloc. In this case I still see the same poor performance. My
>>
> > No. The first pinned entry (0...256M) is inserted by the asm code in
> > head_44x.S. The code in 44x_mmu.c will later map the rest of lowmem
> > (typically up to 768M but various settings can change that) using more
> > 256M entries.
>
> Thanks Ben, appreciate all your wisdom and insight.
>
On Fri, Sep 24, 2010 at 11:07:24AM +1000, Benjamin Herrenschmidt wrote:
> On Thu, 2010-09-23 at 17:35 -0500, Ayman El-Khashab wrote:
> > > Anything you allocate with kmalloc() is going to be mapped by bolted
> > > 256M TLB entries, so there should be no TLB misses happening in the
> > > kernel case.
On Thu, 2010-09-23 at 17:35 -0500, Ayman El-Khashab wrote:
> > Anything you allocate with kmalloc() is going to be mapped by bolted
> > 256M TLB entries, so there should be no TLB misses happening in the
> > kernel case.
> >
>
> Hi Ben, can you or somebody elaborate? I saw the pinned tlb in
> 44x_
On Fri, Sep 24, 2010 at 08:01:04AM +1000, Benjamin Herrenschmidt wrote:
> On Thu, 2010-09-23 at 10:12 -0500, Ayman El-Khashab wrote:
> > I've implemented a working driver on my 460EX. It allocates a couple
> > of buffers of 4MB each. I have a custom memcmp algorithm in asm that
> > is extremely f
On Thu, 2010-09-23 at 10:12 -0500, Ayman El-Khashab wrote:
> I've implemented a working driver on my 460EX. It allocates a couple
> of buffers of 4MB each. I have a custom memcmp algorithm in asm that
> is extremely fast in user space, but half as fast when run on these
> buffers.
>
> my tests ar
I've implemented a working driver on my 460EX. It allocates a couple
of buffers of 4MB each. I have a custom memcmp algorithm in asm that
is extremely fast in user space, but half as fast when run on these
buffers.

My tests show that the algorithm seems to be memory-bandwidth
bound. My gu