Yes! That's my experience as well -- many benchmarks measure something, but it's often not clear exactly *what* they measure. Here's an example: several of the SPECcpu subtests turn out to be heavily dependent on malloc() -- not on the *performance* of the malloc() calls per se (which is what a micro-benchmark would measure), but on the pattern of addresses malloc() returns, which affects cache performance. More on this from Bart shortly (he's looking at these malloc differences)!
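
To make the address-pattern point concrete, here's a rough sketch (mine, not Bart's methodology) that histograms where malloc()'d blocks land in a hypothetical 64-set, 64-byte-line L1 D-cache; the cache geometry and the 40-byte request size are assumptions for illustration only. A skewed histogram means more conflict misses for code touching the start of each block, regardless of how fast the malloc() calls themselves were.

/*
 * Sketch: distribution of malloc() return addresses across cache sets.
 * Assumes a 64-byte line and 64 L1 sets purely for illustration.
 */
#include <stdio.h>
#include <stdlib.h>
#include <stdint.h>

#define NALLOC          100000
#define LINE_SIZE       64      /* assumed cache line size */
#define NSETS           64      /* assumed number of L1 sets */

int
main(void)
{
        static int hist[NSETS];
        int i;

        for (i = 0; i < NALLOC; i++) {
                void *p = malloc(40);   /* small, arbitrary request size */
                if (p == NULL)
                        return (1);
                hist[((uintptr_t)p / LINE_SIZE) % NSETS]++;
                /* blocks deliberately not freed; only layout matters here */
        }

        for (i = 0; i < NSETS; i++)
                printf("set %2d: %d\n", i, hist[i]);
        return (0);
}

Running the same binary against the Solaris and Linux libc allocators and comparing the histograms shows the kind of layout difference a micro-benchmark of call latency would never see.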
However, despite the methodological difficulties, I do think it's possible (and useful) to use such "complicated" benchmarks to probe the OS for performance differences, as long as we keep the compiled code the same (which eliminates compiler tuning as a variable). Doing this, we've found three interesting differences between us and Linux so far:

1) Solaris malloc() doesn't pack as tightly for some apps,
2) the Solaris scheduler is not quite as "sticky" on SMP machines as the Linux scheduler (especially for single-process workloads, e.g. SPECcpu base), and
3) segkpm on Opteron makes a big difference for file I/O.

The bottom line for me: yes, the more complex benchmarks frequently measure something unexpected, but root-causing those unexpected differences gives us a *lot* of useful information about what needs to change in the OS itself.
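
For point (1), a crude way to eyeball "packing" from the same binary on both systems is to compare the bytes requested with the address range the allocator actually spread them over. This is just an illustrative sketch under an assumed 40-byte request size, not the analysis we're actually doing; it says nothing about the scheduler or segkpm effects.

/*
 * Sketch: how tightly does malloc() pack a burst of small requests?
 * Prints requested bytes vs. the address span they occupy.
 */
#include <stdio.h>
#include <stdlib.h>
#include <stdint.h>

#define NALLOC  100000
#define REQSIZE 40              /* assumed small request size */

int
main(void)
{
        uintptr_t lo = (uintptr_t)-1, hi = 0;
        size_t requested = 0;
        int i;

        for (i = 0; i < NALLOC; i++) {
                uintptr_t p = (uintptr_t)malloc(REQSIZE);
                if (p == 0)
                        return (1);
                if (p < lo)
                        lo = p;
                if (p + REQSIZE > hi)
                        hi = p + REQSIZE;
                requested += REQSIZE;
        }

        printf("requested %zu bytes, spread over %zu bytes (%.2fx)\n",
            requested, (size_t)(hi - lo),
            (double)(hi - lo) / (double)requested);
        return (0);
}

A larger spread-to-requested ratio on one OS is a first-order hint that its allocator is packing these small blocks less densely for this allocation pattern.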