Bergmann <[EMAIL PROTECTED]>
Subject
Re: [RFC 1/3] powerpc:
Gunnar von Boehn writes:
> Interesting points.
> Can you help me understand where the negative effect of DCBZ comes
> from?
In my experience, dcbz slows down the hot-cache case because it adds a
few cycles to the execution time of the inner loop, and on most 64-bit
PowerPC implementations,
On Mon, 23 Jun 2008, Gunnar von Boehn wrote:
> > The problem is that the dcbz will generate the alignment exception
> > regardless of whether the data is actually unaligned or not.
> > Once you're on that code path, performance can't be good, can it?
>
> In which case will DCBZ create an aligned e
On Saturday 21 June 2008, Paul Mackerras wrote:
> Is this application really transferring bulk data and using buffers
> that aren't a multiple of the page size? Do you know whether the
> copies ended up being misaligned?
In the problem case that was reported to me, it was all bulk data,
and all t
From: Paul Mackerras <[EMAIL PROTECTED]>
Date: Sat, 21 Jun 2008 14:30:02 +1000
> Is this application really transferring bulk data and using buffers
> that aren't a multiple of the page size? Do you know whether the
> copies ended up being misaligned?
We used to cache align the sub-buffers carve
Arnd Bergmann writes:
> On Friday 20 June 2008, Paul Mackerras wrote:
>
> > Transferring data over loopback is possibly an exception to that.
> > However, it's very rare to transfer large amounts of data over
> > loopback, unless you're running a benchmark like iperf or netperf. :-/
>
> Well, it
On Friday 20 June 2008, Paul Mackerras wrote:
> Transferring data over loopback is possibly an exception to that.
> However, it's very rare to transfer large amounts of data over
> loopback, unless you're running a benchmark like iperf or netperf. :-/
Well, it is the exact case that came up in a
--- On Fri, 6/20/08, Benjamin Herrenschmidt <[EMAIL PROTECTED]> wrote:
> I thought OS X had a trick with a CR bit that would disable
> the dcbz optimization on the first alignment fault? Or did they
> totally remove it?
Ah, it's coming back to me. :)
Apple added 'dcbz', removed it, and then t
On Fri, 2008-06-20 at 10:46 -0700, Sanjay Patel wrote:
> --- On Fri, 6/20/08, Gunnar von Boehn <[EMAIL PROTECTED]> wrote:
> > How important is best performance for the unaligned copy
> > to/from uncacheable memory?
> > The challenge of the CELL chip is that X-form of the shift
> > instructions are
--- On Fri, 6/20/08, Gunnar von Boehn <[EMAIL PROTECTED]> wrote:
> How important is best performance for the unaligned copy
> to/from uncacheable memory?
> The challenge of the CELL chip is that X-form of the shift
> instructions are microcoded.
> The shifts are needed to implement a copy that rea
Re: [Cbe-oss-dev] [RFC 1/3] powerpc: __copy_tofrom_user tweaked
man <[EMAIL PROTECTED]>,
[EMAIL PROTECTED]
> * The naming of the labels (with just numbers) is rather confusing,
> it would be good to have something better, but I must admit that
> I don't have a good idea either.
I will admit that at first glance the label naming with numbers
does look confusing but when you notice that all the loads sta
Gunnar von Boehn writes:
> The "regular" code was much slower for the normal case and has a special
> version for the 4K optimized case.
That's a slightly inaccurate view...
The reason for having the two cases is that when I profiled the
distribution of sizes and alignments of memory copies in t
--- On Thu, 6/19/08, Gunnar von Boehn <[EMAIL PROTECTED]> wrote:
> You are right, the main copy2user requires that the SRC is
> cacheable.
> IMHO, because of the exception on load, the routine should
> fall back to the byte copy loop.
>
> Arnd, could you verify that it works on localstore?
Sin
Michael Ellerman
<[EMAIL PROTECTED]>
On Thursday 19 June 2008, Mark Nelson wrote:
> * __copy_tofrom_user routine optimized for CELL-BE-PPC
A few things I noticed:
* You don't have a page-wise user copy, which the regular code
has. This is probably not so noticeable in iperf, but should
have a significant impact on lmbench and on a
/*
* Copyright (C) 2008 Gunnar von Boehn, IBM Corp.
*
* This program is free software; you can redistribute it and/or
* modify it under the terms of the GNU General Public License
* as published by the Free Software Foundation; either version
* 2 of the License, or (at your option) any later