On Thursday 31 July 2008 03:34, Andrew Morton wrote:
> On Wed, 30 Jul 2008 18:23:18 +0100 Mel Gorman <[EMAIL PROTECTED]> wrote:
> > On (30/07/08 01:43), Andrew Morton didst pronounce:
> > > On Mon, 28 Jul 2008 12:17:10 -0700 Eric Munson <[EMAIL PROTECTED]> wrote:
> > > > Certain workloads benefit if their data or text segments are backed
> > > > by huge pages.
> > >
> > > oh. As this is a performance patch, it would be much better if its
> > > description contained some performance measurement results! Please.
> >
> > I ran these patches through STREAM (http://www.cs.virginia.edu/stream/).
> > STREAM itself was patched to allocate data from the stack instead of
> > statically for the test. They completed without any problem on x86,
> > x86_64 and PPC64 and each test showed a performance gain from using
> > hugepages. I can post the raw figures but they are not currently in an
> > eye-friendly format. Here are some plots of the data though;
> >
> > x86:
> > http://www.csn.ul.ie/~mel/postings/stack-backing-20080730/x86-stream-stack.ps
> > x86_64:
> > http://www.csn.ul.ie/~mel/postings/stack-backing-20080730/x86_64-stream-stack.ps
> > ppc64-small:
> > http://www.csn.ul.ie/~mel/postings/stack-backing-20080730/ppc64-small-stream-stack.ps
> > ppc64-large:
> > http://www.csn.ul.ie/~mel/postings/stack-backing-20080730/ppc64-large-stream-stack.ps
> >
> > The test was to run STREAM with different array sizes (plotted on X-axis)
> > and measure the average throughput (y-axis). In each case, backing the
> > stack with large pages with a performance gain.
>
> So about a 10% speedup on x86 for most STREAM configurations. Handy -
> that's somewhat larger than most hugepage-conversions, iirc.

Although it might be a bit unusual to have codes doing huge streaming
memory operations on stack memory...

We can see why IBM is so keen on their hugepages though :)

> Do we expect that this change will be replicated in other
> memory-intensive apps? (I do).

Such as what? It would be nice to see some numbers with some HPC or java
or DBMS workload using this. Not that I dispute it will help some cases,
but 10% (or 20% for ppc) I guess is getting toward the best case, short
of a specifically written TLB thrasher.