On Sat, 17 Jun 2006, Danial Thom wrote:
> At some point you're going to have to figure out that there's a reason
> that every time anyone other than you tests FreeBSD it completely pigs
> out. Squeezing out some extra bytes in netperf isn't "performance".
> Performance is everything that a system can do. If you're eating 10% more
> cpu to get a few more bytes in netperf, you haven't increased the
> performance of the system.
This test wasn't netperf, it was a 32-process web server and a 32-process
client, doing sendfile on UFS-backed data files. It was definitely a potted
benchmark, in that it omits some of the behaviors of web servers (dynamic
content, significantly variable data set, etc), but is intended to be more
than a simple micro-benchmark involving two sockets and packet blasting.
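For concreteness, the core of each request in such a server is a single
sendfile(2) call handing a file to a connected socket. Here's a minimal
sketch of that pattern (descriptor names hypothetical, error handling
trimmed; this is not the actual benchmark code):

    #include <sys/types.h>
    #include <sys/socket.h>
    #include <sys/uio.h>

    /*
     * Sketch of the per-request sendfile(2) pattern a static-content
     * server exercises: a zero-copy transfer of a UFS-backed file to
     * a connected socket.
     */
    static int
    send_whole_file(int file_fd, int sock_fd, off_t file_size)
    {
            off_t sent = 0;

            /* Would need EAGAIN/partial-write handling if nonblocking. */
            if (sendfile(file_fd, sock_fd, 0, file_size, NULL, &sent, 0) != 0)
                    return (-1);
            return (sent == file_size ? 0 : -1);
    }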
Specifically, it was intended to validate whether or not there were
immediately observable changes in TCP behavior based on adjusting HZ under
load. The answer was a qualified yes: there was a small but noticeable
negative effect on high-load web serving in the test environment when reducing
HZ, likely due to reduced timer accuracy. Specifically: simply frobbing HZ
isn't a strategy that necessarily results in a performance improvement.
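For anyone who wants to reproduce this sort of comparison: HZ can be set
when the kernel is built, or via a loader tunable so that no rebuild is
needed between runs. The values below are sample settings for
illustration, not recommendations:

    # In the kernel configuration file:
    options         HZ=1000

    # Or in /boot/loader.conf, picked up at the next boot:
    kern.hz="250"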
> You need to do things like run 2 benchmarks at once. What happens to the
> "performance" of one benchmark when you increase the "performance" of the
> other? Run a database benchmark while you're running a network benchmark,
> or while you're passing a controlled stream of traffic through the box.
The point of this exercise was to demonstrate the complexity of the issue of
adjusting HZ, and to suggest that simply changing the value, absent further
evidence, could have negative effects, and that we might want to
investigate a more mature middle ground, such as a modified timer
architecture. I'm sorry if that conclusion wasn't clear from my e-mail.
> I'd also love to see the results of the exact same test with only 1 cpu
> enabled, to see how well you scale generally. I'm astounded that no-one
> ever seems to post 1 vs 2 cpu performance, which is the entire point of
> SMP.
Single CPU results were included in my e-mail. There are actually a few other
variations worth measuring in more general benchmarking exercises:
- Kernel compiled without any SMP support. Specifically, without lock
prefixes on atomic instructions.
- Kernel compiled with SMP support, but with use of additional CPUs disabled.
- Kernel compiled with SMP support, and with varying numbers of CPUs enabled.
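As a rough sketch of how to obtain those configurations on FreeBSD (exact
knob names vary by version and architecture, so treat these as illustrative
examples rather than a recipe):

    # 1. No SMP support: build a kernel with "options SMP" removed
    #    from the kernel configuration file.

    # 2. SMP kernel, but additional CPUs left unused: set the boot
    #    tunable in /boot/loader.conf:
    kern.smp.disabled="1"

    # 3. Varying CPU counts: on i386/amd64 kernels of this era, halt
    #    individual CPUs in the idle loop via a bitmask sysctl, e.g.
    #    to halt CPU 1:
    sysctl machdep.hlt_cpus=2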
The first two cases are important, because they help identify the difference
between the general overhead of compiling in locked instructions (and related
issues), and the overheads associated with contention, caches, inter-CPU IPI
traffic, scheduling, etc. By failing to compare the top two cases, it might be
easy to conclude that a performance loss is due to the additional cost of
atomic instructions, whereas in reality it may be the result of a poor
scheduling decision, or of data unnecessarily missing the cache on both CPUs
rather than one because processing of the data is split poorly over the
available CPUs.
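To make the first comparison concrete: on i386/amd64 the only difference in
an atomic operation between the UP and SMP builds is the lock prefix,
roughly along these lines (an illustrative sketch in the spirit of FreeBSD's
machine/atomic.h, not the actual source):

    /*
     * With SMP, the lock prefix serializes the read-modify-write
     * against other CPUs; on a uniprocessor kernel it can be omitted,
     * avoiding its cost on every atomic operation.
     */
    #ifdef SMP
    #define MPLOCKED        "lock ; "
    #else
    #define MPLOCKED
    #endif

    static __inline void
    atomic_add_int(volatile unsigned int *p, unsigned int v)
    {
            __asm __volatile(MPLOCKED "addl %1,%0"
                : "+m" (*p)
                : "ir" (v));
    }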
Robert N M Watson
Computer Laboratory
University of Cambridge