On Wed, 2013-05-29 at 12:26 -0700, Andrew Morton wrote:
> On Wed, 22 May 2013 16:37:18 -0700 Tim Chen <tim.c.c...@linux.intel.com> wrote:
>
> > Currently the per cpu counter's batch size for memory accounting is
> > configured as twice the number of cpus in the system.  However,
> > for systems with very large memory, it is more appropriate to make it
> > proportional to the memory size per cpu in the system.
> >
> > For example, for an x86_64 system with 64 cpus and 128 GB of memory,
> > the batch size is only 2*64 pages (0.5 MB).  So any memory accounting
> > changes of more than 0.5 MB will overflow the per cpu counter into
> > the global counter.  Instead, for the new scheme, the batch size
> > is configured to be 0.4% of the memory/cpu = 8 MB (128 GB/64/256),
> > which is more in line with the memory size.
>
> I renamed the patch to "mm: tune vm_committed_as percpu_counter
> batching size".
>
> Do we have any performance testing results?  They're pretty important
> for a performance-improvement patch ;)
>
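
For reference, the batch arithmetic in the quoted changelog can be
checked with a small user-space sketch.  This is not the patch itself:
the helper names old_batch_pages() and new_batch_pages() are made up
for illustration, and a 4 KB page size is assumed (as on x86_64).

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: 4 KB pages, as on x86_64. */
#define SKETCH_PAGE_SIZE 4096ULL

/* Old scheme from the changelog: batch = 2 * number of cpus, in pages. */
static uint64_t old_batch_pages(unsigned int nr_cpus)
{
	return 2ULL * nr_cpus;
}

/*
 * New scheme from the changelog: batch = (memory per cpu) / 256,
 * i.e. roughly 0.4% of the per-cpu memory, in pages.
 * Hypothetical helper, not the kernel's implementation.
 */
static uint64_t new_batch_pages(uint64_t total_ram_bytes, unsigned int nr_cpus)
{
	assert(nr_cpus > 0);	/* guard the division below */
	return total_ram_bytes / SKETCH_PAGE_SIZE / nr_cpus / 256;
}

/* Convert a batch size in pages to bytes. */
static uint64_t pages_to_bytes(uint64_t pages)
{
	return pages * SKETCH_PAGE_SIZE;
}
```

With 64 cpus and 128 GB of memory, old_batch_pages(64) gives 128 pages
(0.5 MB) and new_batch_pages(128ULL << 30, 64) gives 2048 pages (8 MB),
matching the figures in the changelog.
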

I've done a repeated brk test of 800 KB (from the will-it-scale test
suite) with 80 concurrent processes on a 4-socket Westmere machine with
a total of 40 cores.  Without the patch, about 80% of CPU time is spent
on spinlock contention within the vm_committed_as counter.  With the
patch, there's a 73x speedup on the benchmark and the lock contention
drops off almost entirely.

Tim