Re: [PATCH 1/2] powerpc: string: implement optimized memset variants

2017-04-17 Thread Michael Ellerman
Michael Ellerman writes:
> "Naveen N. Rao" writes:
>> (generic) is with Matt's arch-independent patches applied. Profiling
>> indicates that most of the overhead is actually with the lzo
>> decompression...
>>
>> Also, with a simple module to memset64() a 1GB vmalloc'ed buffer, here
>> are th

Re: [PATCH 1/2] powerpc: string: implement optimized memset variants

2017-04-12 Thread Naveen N. Rao
Excerpts from PrasannaKumar Muralidharan's message of April 5, 2017 11:21:
On 30 March 2017 at 12:46, Naveen N. Rao wrote:
Also, with a simple module to memset64() a 1GB vmalloc'ed buffer, here are the results:
generic:   0.245315533 seconds time elapsed ( +- 1.83% )
optimized:

Re: [PATCH 1/2] powerpc: string: implement optimized memset variants

2017-04-04 Thread PrasannaKumar Muralidharan
On 30 March 2017 at 12:46, Naveen N. Rao wrote:
> Also, with a simple module to memset64() a 1GB vmalloc'ed buffer, here
> are the results:
> generic:   0.245315533 seconds time elapsed ( +- 1.83% )
> optimized: 0.169282701 seconds time elapsed ( +- 1.96% )

Wondering wha
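To put the two memset64() timings quoted above in relative terms, here is a minimal sketch in C. The helper names `percent_reduction` and `speedup` are made up for illustration and are not part of the patch or of perf. With the quoted figures, the optimized run takes roughly 31% less time, which is a speedup of about 1.45x.

```c
/* Hypothetical helpers for comparing two elapsed times (in seconds). */

/* Relative reduction in elapsed time, in percent of the "before" time. */
static double percent_reduction(double before, double after)
{
    return (before - after) / before * 100.0;
}

/* Speedup factor: how many times faster the "after" run is. */
static double speedup(double before, double after)
{
    return before / after;
}
```

For example, `percent_reduction(0.245315533, 0.169282701)` compares the generic and optimized runs. The quoted run-to-run variation (+- 1.83% and +- 1.96%) is small next to that difference.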

Re: [PATCH 1/2] powerpc: string: implement optimized memset variants

2017-04-04 Thread Michael Ellerman
"Naveen N. Rao" writes:
> (generic) is with Matt's arch-independent patches applied. Profiling
> indicates that most of the overhead is actually with the lzo
> decompression...
>
> Also, with a simple module to memset64() a 1GB vmalloc'ed buffer, here
> are the results:
> generic: 0.245315

Re: [PATCH 1/2] powerpc: string: implement optimized memset variants

2017-03-30 Thread Naveen N. Rao
On 2017/03/29 10:36PM, Michael Ellerman wrote:
> "Naveen N. Rao" writes:
> > I also tested zram today with the command shared by Wilcox:
> >
> > without patch: 1.493782568 seconds time elapsed ( +- 0.08% )
> > with patch:    1.408457577 seconds time elapsed ( +- 0.15% )
> >
> >

Re: [PATCH 1/2] powerpc: string: implement optimized memset variants

2017-03-29 Thread Michael Ellerman
"Naveen N. Rao" writes:
> I also tested zram today with the command shared by Wilcox:
>
> without patch: 1.493782568 seconds time elapsed ( +- 0.08% )
> with patch:    1.408457577 seconds time elapsed ( +- 0.15% )
>
> ... which also shows an improvement along the same lines as

Re: [PATCH 1/2] powerpc: string: implement optimized memset variants

2017-03-28 Thread Naveen N. Rao
On 2017/03/28 11:44AM, Michael Ellerman wrote:
> "Naveen N. Rao" writes:
>
> > diff --git a/arch/powerpc/lib/mem_64.S b/arch/powerpc/lib/mem_64.S
> > index 85fa9869aec5..ec531de6 100644
> > --- a/arch/powerpc/lib/mem_64.S
> > +++ b/arch/powerpc/lib/mem_64.S
> > @@ -13,6 +13,23 @@
> > #includ

Re: [PATCH 1/2] powerpc: string: implement optimized memset variants

2017-03-27 Thread Michael Ellerman
"Naveen N. Rao" writes:
> diff --git a/arch/powerpc/lib/mem_64.S b/arch/powerpc/lib/mem_64.S
> index 85fa9869aec5..ec531de6 100644
> --- a/arch/powerpc/lib/mem_64.S
> +++ b/arch/powerpc/lib/mem_64.S
> @@ -13,6 +13,23 @@
> #include
> #include
>
> +_GLOBAL(__memset16)
> + rlwimi r4,r

[PATCH 1/2] powerpc: string: implement optimized memset variants

2017-03-27 Thread Naveen N. Rao
Based on Matthew Wilcox's patches for other architectures.

Signed-off-by: Naveen N. Rao
---
 arch/powerpc/include/asm/string.h | 24
 arch/powerpc/lib/mem_64.S         | 19 ++-
 2 files changed, 42 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/inc
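For readers unfamiliar with the memset16/memset64 family, the following is a portable C sketch of the semantics such routines provide: fill a buffer with `count` copies of a fixed-width value, where `count` is in elements, not bytes. It is not the powerpc assembly from this patch. The names `ref_memset16`, `ref_memset64`, and `replicate16` are hypothetical. The replication helper only illustrates one common way a narrow pattern can be widened so that a 64-bit store loop can be reused; whether the patch does exactly this is not shown in the truncated text above.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: widen a 16-bit pattern to 64 bits by replication. */
static uint64_t replicate16(uint16_t v)
{
    uint64_t x = v;
    x |= x << 16;   /* low 32 bits now hold the pattern twice */
    x |= x << 32;   /* all 64 bits now hold the pattern four times */
    return x;
}

/* Reference fill: store `count` copies of the 16-bit value `v`. */
static void *ref_memset16(uint16_t *s, uint16_t v, size_t count)
{
    uint16_t *p = s;

    while (count--)
        *p++ = v;
    return s;
}

/* Reference fill: store `count` copies of the 64-bit value `v`. */
static void *ref_memset64(uint64_t *s, uint64_t v, size_t count)
{
    uint64_t *p = s;

    while (count--)
        *p++ = v;
    return s;
}
```

An optimized architecture-specific version is expected to produce the same memory contents as these loops, only faster; the timings quoted elsewhere in the thread compare such variants.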