I am experimenting with a T1000 machine (1 GHz T1 processor, 8 cores, 4 threads
per core, 8 GB RAM). I was unable to saturate a single Gigabit NIC with netperf,
so I started investigating with the help of performance counters. It turns out
that even a simple for loop that only increments a counter executes at most 250
million instructions per second (with hardly any cache or TLB misses, as
expected). From my understanding of the Niagara architecture, a single thread
executing on a core should be able to fully utilise it (1 billion instructions
per second in my case, i.e. one instruction per cycle at 1 GHz).
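
For reference, the kind of loop described above might look like the following
minimal C sketch. It is an illustration only, not the original test program:
the function name spin, the iteration parameter n, and the use of volatile (to
stop the compiler from folding the loop away) are all assumptions.

```c
#include <stdint.h>

/*
 * Hypothetical helper (not from the original post): increments a counter
 * n times and returns the final value. The counter is declared volatile so
 * that an optimising compiler cannot replace the loop with a constant; this
 * also forces a load and a store on every iteration.
 */
uint64_t spin(uint64_t n)
{
    volatile uint64_t counter = 0;
    uint64_t i;

    for (i = 0; i < n; i++) {
        counter++;
    }
    return counter;
}
```

Called from a small main with a large n, a function like this could be run
under cputrack to reproduce the instruction-count measurement.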

What am I missing?

Thanks,
Elad

P.S. I am tracking performance with: cputrack -c Instr_cnt,sys
 
 
This message posted from opensolaris.org
_______________________________________________
perf-discuss mailing list
perf-discuss@opensolaris.org