Re: [perf-discuss] Performance of 32-bit vs 64-bit benchmark

Elad Lahav Fri, 30 Jan 2009 14:38:08 -0800

> I don't think that follows.
> 
> You've put together a compute intensive benchmark - so the constraint on 
> that code is how many instructions per second you can execute.
> 
> It doesn't follow that it is analogous to a webserver. A webserver is 
> not generally so compute intensive. If there's lots of memory stall, 
> then the T1/T2 will use that stall time to get useful work done on other 
> threads, most other processors will have idle pipelines until the stall 
> resolves. [A benchmark for this situation would be something like 
> running multiple copies of the latency measurements in lmbench.]
> 
> This is a workload characterisation question. Just because platform A 
> underperforms platform B on workload C it does not follow that it will 
> also underperform on workload D.


I never claimed the compute-intensive benchmark is indicative of web-server
performance. I was claiming (and I may be wrong) that the difference between
the T1 and the Xeon on this benchmark is supposed to give a lower bound on
the difference in any other workload, since it is heavily biased in favour
of the T1's strengths. If anything, I would expect the difference between
the two processors to grow once we exit the realm of perfectly-parallelised,
integer-compute-intensive applications.

 > It doesn't follow that it is analogous to a webserver. A webserver is not
 > generally so compute intensive. If there's lots of memory stall, then the
 > T1/T2 will use that stall time to get useful work done on other threads,
 > most other processors will have idle pipelines until the stall resolves.
 > [A benchmark for this situation would be something like running multiple
 > copies of the latency measurements in lmbench.]

The quad-core Xeon would compensate for the stalling with superscalar 
mechanisms.
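
To make the quoted stall-hiding argument concrete, here is a rough sketch of
the usual back-of-the-envelope model for a multithreaded in-order core. It is
a toy model, not a measurement: the function name and the cycle counts are
made up for illustration, thread switching is assumed to be free, and caches,
prefetching and out-of-order execution (which would favour a processor like
the Xeon) are ignored entirely.

```python
def core_utilization(hw_threads, compute_cycles, stall_cycles):
    """Fraction of cycles in which one core issues useful work (toy model).

    Each hardware thread alternates `compute_cycles` of work with
    `stall_cycles` of waiting on memory. A single thread keeps the pipeline
    busy for C / (C + S) of the time; with N threads the stalls of one
    thread can overlap the work of the others, until the pipeline
    saturates at 1.0.
    """
    if hw_threads < 1 or compute_cycles <= 0 or stall_cycles < 0:
        raise ValueError(
            "need hw_threads >= 1, compute_cycles > 0, stall_cycles >= 0"
        )
    single_thread = compute_cycles / (compute_cycles + stall_cycles)
    return min(1.0, hw_threads * single_thread)


if __name__ == "__main__":
    # Hypothetical workloads: (compute cycles, stall cycles) per burst.
    workloads = {"compute-bound": (100, 0), "stall-heavy": (20, 80)}
    for name, (c, s) in workloads.items():
        for n in (1, 4):
            u = core_utilization(n, c, s)
            print(f"{name:14s} threads/core={n}: utilization={u:.2f}")
```

In this model extra hardware threads buy nothing on a compute-bound
workload, since one thread already saturates the pipeline, and help only when
stalls dominate. Whether that is enough to close the gap on a real machine is
exactly the open question here.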

I am not trying to argue against the T1 in any way. This specific Xeon
processor is 2 years newer, runs faster and has a much larger L2 cache than
the T1. I suspect that it also consumes much more energy and produces more
heat.

The only point of this exercise was to assess, based on the results of both
the macro and micro benchmarks, whether it would be worthwhile to invest
more time in optimising and tuning the T1000 to get SPECweb results
comparable to those of the Xeon machine. I think the answer is that it would
be of no use to tune it further, since the Xeon has an inherent advantage in
all performance measurements. Again, I may be wrong, and would be glad to
hear different opinions.

Thanks,
--Elad
_______________________________________________
perf-discuss mailing list
perf-discuss@opensolaris.org
