On 28 July 2014 13:37, Ryan Stone <ryst...@gmail.com> wrote:
> On Sun, Jul 27, 2014 at 4:42 PM, George Neville-Neil
> <g...@neville-neil.com> wrote:
>> Chiming in late, but don't you mean instruction-retired instead of
>> CPU_CLK_UNHALTED_CORE?
>>
>> Best,
>> George
>
> In my experience instruction-retired gives very misleading profiler
> output in most cases. The problem is that instruction-retired gives
> equal weight to all instructions, which means that it does not take
> into account instructions with long latencies because they (for
> example) missed the cache. CPU_CLK_UNHALTED_CORE (or its alias,
> unhalted-cycles) is a much better event because it is a nearer proxy
> for time-based sampling, which is really what you're interested in
> when trying to reduce runtime of processes.
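
To put rough numbers on that point, here is a toy sketch. It is not real
pmcstat output: the two function names, the instruction counts and the
cycles-per-instruction figures are all made up for illustration. It only
shows how the same workload is attributed when samples are weighted by
instructions retired versus by unhalted cycles.

```python
# Hypothetical workload, invented for illustration:
# (function name, instructions retired, average cycles per instruction)
WORKLOAD = [
    ("fast_loop", 9_000, 1),      # cache-friendly, about one cycle per instruction
    ("cache_misser", 1_000, 50),  # few instructions, mostly stalled on memory
]


def sample_share(workload, weight):
    """Return each function's fraction of samples under the given weighting."""
    totals = {name: weight(instrs, cpi) for name, instrs, cpi in workload}
    grand_total = sum(totals.values())
    return {name: count / grand_total for name, count in totals.items()}


# Sampling on instructions retired: every instruction counts the same.
by_instructions = sample_share(WORKLOAD, lambda instrs, cpi: instrs)

# Sampling on unhalted cycles: each instruction counts for the cycles it took.
by_cycles = sample_share(WORKLOAD, lambda instrs, cpi: instrs * cpi)

for name, _, _ in WORKLOAD:
    print(f"{name:13s} instructions: {by_instructions[name]:6.1%}"
          f"  cycles: {by_cycles[name]:6.1%}")
```

With these invented numbers the stalled function gets 10% of the
instruction-weighted samples but roughly 85% of the cycle-weighted ones,
which is the skew described above.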

Right. It is a union of all the things that screw with you: frontend
stalls, backend/retire stalls, microcode operation stalls, FPU length
stalls, branch misprediction stalls, L3 miss (i.e., memory) stalls, and
cache ping-ponging stalls. Figuring out -which- of those is the problem
requires a little further digging.

> My one big complaint with unhalted-cycles is that it does not take
> into account CPU time spent in busy-wait loops that use the pause
> instruction, so it vastly underweights time spent adaptively spinning on
> kernel mutexes, for instance.

Well, it depends on whether you want to know about the time it spends
in busy-wait loops using PAUSE. (Are there any flags / modifiers that
have the CPU not count that?)

> I'm also not sure what it does when the
> CPU is adjusting its frequency, but that's not a case that I ever have
> to deal with personally.

That's the difference between _CORE and _REF.


-a