On 01/30/09 01:40 PM, Elad Lahav wrote:
>> I'm not sure that I follow your argument. The T1000's architecture
>> favors workloads that have many parallel tasks that involve data
>> throughput. The Xeon is going to have a better showing for straight
>> number-crunching work. If your webserver benchmark is trying to measure
>> the throughput for a few clients that perform simple tasks, then I can
>> understand why you might expect the Xeon to do better. However, if
>> you're trying to measure a workload that has many clients and measures
>> in ops/sec, the T1000 may well do better.
>
> This is exactly why I created this benchmark. On the T1000 it uses 32
> threads to encrypt the file in parallel - exactly the situation this
> machine is supposed to shine in. From the performance counters I can
> tell that the CPU is maxed, for a total of close to 8 billion operations
> per second, so that there is almost no internal stalling.
> My point was that even in this situation the UltraSPARC T1 lags
> considerably behind the quad-core Xeon.
>

Hi,

I don't think that follows. You've put together a compute-intensive
benchmark, so the constraint on that code is how many instructions per
second you can execute. It doesn't follow that it is analogous to a
webserver.

A webserver is not generally so compute-intensive. If there is a lot of
memory stall, then the T1/T2 will use that stall time to get useful work
done on other threads, whereas most other processors will have idle
pipelines until the stall resolves. [A benchmark for this situation would
be something like running multiple copies of the latency measurements in
lmbench.]

This is a workload characterisation question. Just because platform A
underperforms platform B on workload C, it does not follow that it will
also underperform on workload D.

Looking at the specweb results - which are probably the benchmarks
closest to the situation that you are trying to model - your expectation
may not be that far out of line (I cannot see any results submitted using
a Xeon E5440).

http://www.spec.org/web2005/results/web2005.html

Looking through the results, and trying to find stuff that's vaguely
comparable:

  Xeon E3360     => 18495
  2x Xeon E5460  => 28127
  T2000          => 16407  (T2000 = 2RU version of T1000)
  T1000          => 10466
  T5220          => 41847  (UltraSPARC T2 processor)

From this I'd expect figures in the same ballpark from the T1 and the
Xeon for webserving.

The main point I'd make is that a compute-intensive benchmark is not
necessarily a good proxy for a webserver.

Regards,

Darryl.

> --Elad
> _______________________________________________
> perf-discuss mailing list
> perf-discuss@opensolaris.org

-- 
Darryl Gove
Compiler Performance Engineering
Blog: http://blogs.sun.com/d/
Book: http://www.sun.com/books/catalog/solaris_app_programming.xml