Take a look at lmbench:
http://www.bitmover.com/lmbench/
This reports memory latency. You can get an estimate of the cost of TLB
handling by subtracting the cost of a cache miss where there is no TLB
miss from the cost of a cache miss where there is a TLB miss.
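
To make the subtraction concrete, here is a minimal sketch. The function
name and the latency figures are hypothetical placeholders of mine, not
lmbench output or real measurements; plug in whatever numbers you measure.

```python
def tlb_miss_cost_ns(miss_with_tlb_miss_ns, miss_without_tlb_miss_ns):
    """Estimate the TLB handling cost per access, in nanoseconds.

    Both arguments are cache-miss latencies you measured yourself:
    one where every access also misses the TLB, one where it does not.
    """
    if miss_with_tlb_miss_ns < miss_without_tlb_miss_ns:
        raise ValueError("latency with a TLB miss should not be the smaller one")
    return miss_with_tlb_miss_ns - miss_without_tlb_miss_ns


# Hypothetical figures for illustration only (not measured on any machine).
example_cost = tlb_miss_cost_ns(120.0, 95.0)
print(f"estimated TLB handling cost: {example_cost} ns")
```

The result is only an estimate: it assumes the two measurements differ in
nothing except whether the TLB misses.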
If you want to do this independe
IMO, this problem is harder than "just porting trapstat to x86/x64" --
if you google trapstat+t2 or trapstat+ultrasparc+t2, you will find
that measuring the cost of TLB misses is just as hard on UltraSPARC T2,
because both x86/x64 and T2 support hardware table walk (the hardware
can service a TLB miss without trapping to software).
Rayson
On Sat, Aug 21, 2