On Wed, Jan 13, 2010 at 09:17:47AM +0800, Li, Aubrey wrote:
> johansen wrote:
> >
> >On Tue, Jan 12, 2010 at 02:20:02PM +0800, zhihui Chen wrote:
> >> Application can be categoried into CPU-sensitive, Memory-sensitive,
> >> IO-sensitive.
> >
> >My concern here is that unless the customer knows how to determine
> >whether his application is CPU, memory, or IO sensitive it's going to be
> >hard to use the tools well.
> >
>
> "sysload" in NUMAtop can tell the customer if the app is cpu sensitive.
> "Last Level Cache Miss per Instruction" will be added into NUMAtop to
> determine if the app is memory sensitive.

Ok, that makes sense.  Thanks for those additions.

> Thanks to point this issue out. We are not SPARC expert and I think SPARC
> NUMAtop design is not in our phase I design, :)
> We hope the SPARC expert like you or other expert can take SPARC into
> account and extend this tool onto SPARC platform.

That does raise an interesting question, though.  Does SPARC have CPU
counters that you could port the tool to?  If not, what counters would
you need added in order for this to work?

> As for the metric of NUMAtop, the memory access latency is a good idea.
> But the absolute amount is not a good indicator for NUMAtop. This amount
> will be different on different platforms, a specific number of amount is
> good on one platform while it's bad on another one. It's hard to tell the
> customer what data is good. So we will introduce a ratio into NUMAtop,
>
> "LLC Latency ratio" =
> "the actual memory access latency" / "calibrated local memory access latency"
>
> We assume different node hop has different memory access latency,
> longer distance node hop has the longer memory access latency. This
> ratio will be near to 1 if most of the memory access of the
> application is to the local memory.

This is reasonable, but are you able to leverage any of the ACPI data
that's contained in the x86 BIOS?  I know Jonathan Chew did a bunch of
work to read the various latency tables.  Could you take advantage of
the work that he has already done to determine how long RMA to various
hops would take?

> So as a conclusion, here we propose the metrics of NUMAtop
> 1) sysload - cpu sensitive
> 2) LLC Miss per Instruction - memory sensitive
> 3) LLC Latency ratio - memory locality
> 4) the percent of the number of LMA/RMA access / total memory access
> - 4.1) LMA/(total memory access)%
> - 4.2) RMA/(total memory access)%

This seems like an improvement.  Thanks. :)

> BTW: Do we still need one more +1 vote for NUMAtop project?

You have one from me and one from esaxe.  I think you still need one
more.
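
To make the proposed metrics 2) through 4) concrete, here is a minimal
sketch of how they might be derived from raw counter values.  It is not
NUMAtop code.  The `CounterSample` fields, the `numatop_metrics` name,
and the nanosecond units are all hypothetical, and the sketch assumes
that "total memory access" is simply LMA + RMA, which the thread does
not spell out.  sysload is left out because its definition is not given
here.

```python
from dataclasses import dataclass


@dataclass
class CounterSample:
    """Hypothetical per-process counter readings over one interval."""
    instructions: int      # retired instructions
    llc_misses: int        # last-level cache misses
    lma: int               # local memory accesses
    rma: int               # remote memory accesses
    avg_latency_ns: float  # measured average memory access latency


def numatop_metrics(sample: CounterSample,
                    calibrated_local_latency_ns: float) -> dict:
    """Compute the ratio-style metrics proposed in the thread.

    Assumes total memory accesses = LMA + RMA (an assumption, not
    something the thread states).
    """
    if calibrated_local_latency_ns <= 0:
        raise ValueError("calibrated local latency must be positive")

    total = sample.lma + sample.rma
    return {
        # 2) LLC Miss per Instruction: higher suggests memory sensitivity
        "llc_miss_per_instruction": (
            sample.llc_misses / sample.instructions
            if sample.instructions else 0.0
        ),
        # 3) LLC Latency ratio: near 1.0 when most accesses are local
        "llc_latency_ratio": (
            sample.avg_latency_ns / calibrated_local_latency_ns
        ),
        # 4.1) and 4.2) share of local and remote accesses, in percent
        "lma_percent": 100.0 * sample.lma / total if total else 0.0,
        "rma_percent": 100.0 * sample.rma / total if total else 0.0,
    }


if __name__ == "__main__":
    sample = CounterSample(instructions=1000, llc_misses=10,
                           lma=75, rma=25, avg_latency_ns=150.0)
    for name, value in numatop_metrics(sample, 100.0).items():
        print(f"{name}: {value:.3f}")
```

Because the latency ratio is normalized against a per-platform
calibration value, the same threshold can be compared across machines,
which is the point of using a ratio rather than an absolute latency.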

-j
_______________________________________________
perf-discuss mailing list
perf-discuss@opensolaris.org