Re: [RFC 1/3] powerpc: __copy_tofrom_user tweaked for Cell

2008-06-27 Thread Gunnar von Boehn

Re: [RFC 1/3] powerpc: __copy_tofrom_user tweaked for Cell

2008-06-23 Thread Paul Mackerras
Gunnar von Boehn writes:
> Interesting points.
> Can you help me to understand where the negative effect of DCBZ does come
> from?
In my experience, dcbz slows down the hot-cache case because it adds a few cycles to the execution time of the inner loop, and on most 64-bit PowerPC implementations,

Re: [RFC 1/3] powerpc: __copy_tofrom_user tweaked for Cell

2008-06-23 Thread Geert Uytterhoeven
On Mon, 23 Jun 2008, Gunnar von Boehn wrote:
> > The problem is that the dcbz will generate the alignment exception
> > regardless of whether the data is actually unaligned or not.
> > Once you're on that code path, performance can't be good, can it?
>
> In which case will DCBZ create an aligned e

Re: [RFC 1/3] powerpc: __copy_tofrom_user tweaked for Cell

2008-06-23 Thread Gunnar von Boehn

Re: [Cbe-oss-dev] [RFC 1/3] powerpc: __copy_tofrom_user tweaked for Cell

2008-06-21 Thread Arnd Bergmann
On Saturday 21 June 2008, Paul Mackerras wrote:
> Is this application really transferring bulk data and using buffers
> that aren't a multiple of the page size?  Do you know whether the
> copies ended up being misaligned?
In the problem case that was reported to me, it was all bulk data, and all t

Re: [Cbe-oss-dev] [RFC 1/3] powerpc: __copy_tofrom_user tweaked for Cell

2008-06-20 Thread David Miller
From: Paul Mackerras <[EMAIL PROTECTED]>
Date: Sat, 21 Jun 2008 14:30:02 +1000
> Is this application really transferring bulk data and using buffers
> that aren't a multiple of the page size? Do you know whether the
> copies ended up being misaligned?
We used to cache align the sub-buffers carve

Re: [Cbe-oss-dev] [RFC 1/3] powerpc: __copy_tofrom_user tweaked for Cell

2008-06-20 Thread Paul Mackerras
Arnd Bergmann writes:
> On Friday 20 June 2008, Paul Mackerras wrote:
>
> > Transferring data over loopback is possibly an exception to that.
> > However, it's very rare to transfer large amounts of data over
> > loopback, unless you're running a benchmark like iperf or netperf. :-/
>
> Well, it

Re: [Cbe-oss-dev] [RFC 1/3] powerpc: __copy_tofrom_user tweaked for Cell

2008-06-20 Thread Arnd Bergmann
On Friday 20 June 2008, Paul Mackerras wrote:
> Transferring data over loopback is possibly an exception to that.
> However, it's very rare to transfer large amounts of data over
> loopback, unless you're running a benchmark like iperf or netperf. :-/
Well, it is the exact case that came up in a

Re: [RFC 1/3] powerpc: __copy_tofrom_user tweaked for Cell

2008-06-20 Thread Sanjay Patel
--- On Fri, 6/20/08, Benjamin Herrenschmidt <[EMAIL PROTECTED]> wrote:
> I thought OS X had a trick with a CR bit that would disable
> the dcbz optimization on the first alignment fault? Or did they
> totally remove it?
Ah, it's coming back to me. :) Apple added 'dcbz', removed it, and then t

Re: [RFC 1/3] powerpc: __copy_tofrom_user tweaked for Cell

2008-06-20 Thread Benjamin Herrenschmidt
On Fri, 2008-06-20 at 10:46 -0700, Sanjay Patel wrote:
> --- On Fri, 6/20/08, Gunnar von Boehn <[EMAIL PROTECTED]> wrote:
> > How important is best performance for the unaligned copy
> > to/from uncacheable memory?
> > The challenge of the CELL chip is that X-form of the shift
> > instructions are

Re: [RFC 1/3] powerpc: __copy_tofrom_user tweaked for Cell

2008-06-20 Thread Sanjay Patel
--- On Fri, 6/20/08, Gunnar von Boehn <[EMAIL PROTECTED]> wrote:
> How important is best performance for the unaligned copy
> to/from uncacheable memory?
> The challenge of the CELL chip is that X-form of the shift
> instructions are microcoded.
> The shifts are needed to implement a copy that rea

Re: [Cbe-oss-dev] [RFC 1/3] powerpc: __copy_tofrom_user tweaked for Cell

2008-06-20 Thread Gunnar von Boehn

Re: [RFC 1/3] powerpc: __copy_tofrom_user tweaked for Cell

2008-06-20 Thread Gunnar von Boehn

Re: [RFC 1/3] powerpc: __copy_tofrom_user tweaked for Cell

2008-06-19 Thread Mark Nelson
> * The naming of the labels (with just numbers) is rather confusing,
> it would be good to have something better, but I must admit that
> I don't have a good idea either.
I will admit that at first glance the label naming with numbers does look confusing but when you notice that all the loads sta

Re: [Cbe-oss-dev] [RFC 1/3] powerpc: __copy_tofrom_user tweaked for Cell

2008-06-19 Thread Paul Mackerras
Gunnar von Boehn writes:
> The "regular" code was much slower for the normal case and has a special
> version for the 4K optimized case.
That's a slightly inaccurate view... The reason for having the two cases is that when I profiled the distribution of sizes and alignments of memory copies in t

Re: [RFC 1/3] powerpc: __copy_tofrom_user tweaked for Cell

2008-06-19 Thread Sanjay Patel
--- On Thu, 6/19/08, Gunnar von Boehn <[EMAIL PROTECTED]> wrote:
> You are right the main copy2user requires that the SRC is
> cacheable.
> IMHO because of the exception on load, the routine should
> fall back to the
> byte copy loop.
>
> Arnd, could you verify that it works on localstore?
Sin

Re: [RFC 1/3] powerpc: __copy_tofrom_user tweaked for Cell

2008-06-19 Thread Gunnar von Boehn

Re: [RFC 1/3] powerpc: __copy_tofrom_user tweaked for Cell

2008-06-19 Thread Arnd Bergmann
On Thursday 19 June 2008, Mark Nelson wrote:
> * __copy_tofrom_user routine optimized for CELL-BE-PPC
A few things I noticed:
* You don't have a page-wise user copy, which the regular code has. This is probably not so noticeable in iperf, but should have a significant impact on lmbench and on a

[RFC 1/3] powerpc: __copy_tofrom_user tweaked for Cell

2008-06-19 Thread Mark Nelson
/*
 * Copyright (C) 2008 Gunnar von Boehn, IBM Corp.
 *
 * This program is free software; you can redistribute it and/or
 * modify it under the terms of the GNU General Public License
 * as published by the Free Software Foundation; either version
 * 2 of the License, or (at your option) any later