Re: [PATCH 1/3] powerpc: POWER7 optimised copy_page using VMX

2011-06-17 Thread Segher Boessenkool
+ addi r1,r1,STACKFRAMESIZE + + .align 5 Do we know that the blank will be filled with something harmless? Yes. See ppc_handle_align() in gas/config/tc-ppc.c: it fills with nops (ori 0,0,0), and a branch if there are more than four nops, and for POWER6 and POWER7 it puts a
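To make the padding arithmetic concrete: on PowerPC gas, `.align 5` aligns to a 2^5 = 32-byte boundary, and every instruction is 4 bytes. The following is only a rough sketch of the rule as described in the reply above (nops, with a branch once there are more than four); the helper names are hypothetical and this is not the code in tc-ppc.c.

```c
#include <assert.h>
#include <stdint.h>

#define INSN_BYTES 4u /* every PowerPC instruction is 4 bytes */

/* Bytes of padding needed to move addr up to the next 2^log2_align boundary. */
static uint32_t align_pad_bytes(uint64_t addr, unsigned log2_align)
{
    assert(log2_align < 32);
    uint64_t mask = ((uint64_t)1 << log2_align) - 1;
    return (uint32_t)((0 - addr) & mask);
}

/* Number of 4-byte instruction slots the assembler has to fill. */
static uint32_t align_pad_slots(uint64_t addr, unsigned log2_align)
{
    return align_pad_bytes(addr, log2_align) / INSN_BYTES;
}

/* Per the description above: a branch is used when more than four nops
 * would otherwise be needed. (Hypothetical helper, illustration only.) */
static int align_pad_uses_branch(uint64_t addr, unsigned log2_align)
{
    return align_pad_slots(addr, log2_align) > 4;
}
```

For example, code ending at offset 0x104 followed by `.align 5` leaves 28 bytes (7 slots) to fill, which is past the four-nop threshold.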

Re: [PATCH 1/3] powerpc: POWER7 optimised copy_page using VMX

2011-06-16 Thread Benjamin Herrenschmidt
On Fri, 2011-06-17 at 14:53 +1000, Anton Blanchard wrote:
> +#include
> +#include
> +
> +#define STACKFRAMESIZE 112
> +
> +_GLOBAL(copypage_power7)
> + mflr r0
> + std r3,48(r1)
> + std r4,56(r1)
> + std r0,16(r1)
> + stdu r1,-STACKFRAMESIZE(r1)
> +
>

Re: [PATCH 1/3] powerpc: POWER7 optimised copy_page using VMX

2011-06-16 Thread Benjamin Herrenschmidt
On Fri, 2011-06-17 at 14:53 +1000, Anton Blanchard wrote:
> plain text document attachment (power7_copypage)
> Implement a POWER7 optimised copy_page using VMX. We copy a cacheline
> at a time using VMX loads and stores.
>
> Signed-off-by: Anton Blanchard
> ---
>
> How do we want to handle per m

Re: [PATCH 1/3] powerpc: POWER7 optimised copy_page using VMX

2011-06-16 Thread Anton Blanchard
Hi,

> Yeah, I'm pretty against CPU_FTR_POWER7. Every loon is going to
> attach anything POWER7 to it.
>
> I'm keen to see it setup in __setup_cpu_power7. Either a function
> pointer or use the patch_instruction infrastructure to avoid indirect
> function calls on small copies.

Instruction
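For illustration, the function-pointer option mentioned in the quoted text could look roughly like the user-space sketch below. All names here (copy_page_fn, setup_cpu_example, copy_page_vmx_stub, EXAMPLE_PAGE_SIZE) are hypothetical and not taken from the patch; the "VMX" variant is a plain memcpy stand-in so the example runs anywhere.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define EXAMPLE_PAGE_SIZE 4096 /* assumption: illustrative page size only */

typedef void (*copy_page_fn)(void *to, const void *from);

/* Generic fallback used when no CPU-specific routine is selected. */
static void copy_page_generic(void *to, const void *from)
{
    memcpy(to, from, EXAMPLE_PAGE_SIZE);
}

/* Stand-in for a CPU-optimised routine; counts calls so the dispatch
 * can be observed. A real version would use VMX loads and stores. */
static int vmx_calls;
static void copy_page_vmx_stub(void *to, const void *from)
{
    vmx_calls++;
    memcpy(to, from, EXAMPLE_PAGE_SIZE);
}

/* The pointer is chosen once at CPU setup time, not on every call. */
static copy_page_fn copy_page_impl = copy_page_generic;

static void setup_cpu_example(int has_vmx)
{
    copy_page_impl = has_vmx ? copy_page_vmx_stub : copy_page_generic;
    assert(copy_page_impl != NULL);
}

/* Every caller pays one indirect call here. */
static void copy_page_example(void *to, const void *from)
{
    copy_page_impl(to, from);
}
```

The indirect call in copy_page_example is the cost the quoted text wants to avoid on small copies; patching the call site at boot would replace it with a direct branch instead.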

Re: [PATCH 1/3] powerpc: POWER7 optimised copy_page using VMX

2011-06-16 Thread Michael Neuling
> Implement a POWER7 optimised copy_page using VMX. We copy a cacheline
> at a time using VMX loads and stores.
>
> Signed-off-by: Anton Blanchard
> ---
>
> How do we want to handle per machine optimised functions? I create
> yet another feature bit, but feature bits might get out of control
> a