Hans Petter Selasky wrote this message on Wed, May 13, 2015 at 10:35 +0200:
> On 05/13/15 10:27, David Chisnall wrote:
> > On 13 May 2015, at 09:03, John-Mark Gurney <[email protected]> wrote:
> >>
> >> Poul-Henning Kamp wrote this message on Tue, May 12, 2015 at 06:31 +0000:
> >>> --------
> >>> In message <[email protected]>, John-Mark Gurney writes:
> >>>
> >>>> Also, you'd probably see even better performance by increasing the
> >>>> size to 64k, [...]
> >>>
> >>> easy:
> >>> 8K on 32bit
> >>> 64k on 64bit
> >>
> >> Sounds good to me... Just for people who care... I did a quick set of
> >> benchmarks on sha256.. This is using my preliminary patch to use sse4
> >> optimized sha256... But this should be the same for others...
> >>
> >> The numbers in ministat output are the time in seconds it takes my
> >> 3.4GHz AMD A10-5700 APU running HEAD to process a 512MB file, so lower
> >> numbers are better.. I've processed them into easier to read format:
> >> BUFSIZ: 145MB/sec
> >> 8k: 193MB/sec
> >> 16k: 198MB/sec
> >> 64k: 202MB/sec
> >> 128k: 202MB/sec
> >> -t: 211MB/sec
> >
> > It looks like most of the benefit is gained at 16KB. Did you try running
> > the benchmark with something else running at the same time to see if there
> > is any advantage in trashing the caches a bit less (simple case, what
> > happens if you run two instances of the same benchmark at once)?
> >
> > I suspect that you're about right anyway - I recently did some tests
> > while playing with JavaScript FFI generation with a multithreaded process
> > JavaScript environment calling out to OpenSSL to do SHA calculations and
> > having each of 8 threads reading in 128KB chunks gave the fastest
> > performance (Core i7, 4 cores + hyperthreading), with only a negligible
> > gain over 64KB. In all cases, the JavaScript implementation was
> > significantly faster than the openssl tool, which used 8KB buffers.
>
> You should also try this using a USB disk. The performance numbers
> depend heavily on the hardware's interrupt moderation values.

This shouldn't matter.. I wasn't flushing the buffer cache between
runs, so the data came entirely from the buffer cache... What is being
measured here is purely syscall+copy overhead... No matter what
your source is (NFS, USB disk), you'll always have this overhead...
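
For anyone who wants to reproduce this kind of measurement, here is a
minimal sketch, not the patch or the benchmark used above. It assumes
a warm buffer cache and issues one read(2) per chunk, so the per-call
syscall+copy overhead shrinks as the buffer grows. The names hash_file
and throughput_mb_per_sec and the 16MB temporary file are made up for
illustration, and Python adds interpreter overhead of its own, so the
absolute numbers will not match the figures quoted above.

```python
import hashlib
import os
import tempfile
import time


def hash_file(path, bufsize):
    """SHA-256 a file, issuing one read(2) per bufsize-byte chunk."""
    digest = hashlib.sha256()
    fd = os.open(path, os.O_RDONLY)  # raw fd: no stdio buffering in the way
    try:
        while True:
            chunk = os.read(fd, bufsize)
            if not chunk:  # empty read means EOF
                break
            digest.update(chunk)
    finally:
        os.close(fd)
    return digest.hexdigest()


def throughput_mb_per_sec(path, bufsize):
    """Time one full pass over the file and convert it to MB/sec."""
    size = os.path.getsize(path)
    start = time.perf_counter()
    hash_file(path, bufsize)
    elapsed = time.perf_counter() - start
    return size / (1024 * 1024) / elapsed


if __name__ == "__main__":
    # Hypothetical 16MB test file; the thread used a 512MB file.
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(os.urandom(16 * 1024 * 1024))
        path = tmp.name
    try:
        hash_file(path, 65536)  # warm-up pass so later runs hit the cache
        for bufsize in (1024, 8192, 16384, 65536, 131072):
            rate = throughput_mb_per_sec(path, bufsize)
            print(f"{bufsize // 1024}k: {rate:.0f}MB/sec")
    finally:
        os.unlink(path)
```

A single pass per size is noisy; repeating each size several times and
feeding the timings to ministat gives more trustworthy comparisons.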
--
John-Mark Gurney Voice: +1 415 225 5579
"All that I will do, has been done, All that I have, has not."