Re: [perf-discuss] NUMAtop for OpenSolaris

johansen Tue, 12 Jan 2010 17:37:25 -0800

On Wed, Jan 13, 2010 at 09:17:47AM +0800, Li, Aubrey wrote:
> johansen wrote:
> >
> >On Tue, Jan 12, 2010 at 02:20:02PM +0800, zhihui Chen wrote:
> >> Applications can be categorized as CPU-sensitive, memory-sensitive,
> >> or IO-sensitive.
> >
> >My concern here is that unless the customer knows how to determine
> >whether his application is CPU, memory, or IO sensitive it's going to be
> >hard to use the tools well.
> >
> 
> "sysload" in NUMAtop can tell the customer if the app is CPU sensitive.
> "Last Level Cache Miss per Instruction" will be added to NUMAtop to
> determine if the app is memory sensitive.


Ok, that makes sense.  Thanks for those additions.

> Thanks for pointing this issue out. We are not SPARC experts, and SPARC
> support is not part of our phase I design for NUMAtop. :)
> We hope that a SPARC expert like you, or other experts, can take SPARC
> into account and extend this tool to the SPARC platform.

That does raise an interesting question, though.  Does SPARC have CPU
counters that you could port the tool to?  If not, what counters would
you need added in order for this to work?

> As for the metrics of NUMAtop, memory access latency is a good idea.
> But the absolute value is not a good indicator for NUMAtop. It will
> differ across platforms: a specific value may be good on one platform
> and bad on another, so it's hard to tell the customer what value is
> good. So we will introduce a ratio into NUMAtop:
> 
> "LLC Latency ratio" = 
> "the actual memory access latency" / "calibrated local memory access latency"
> 
> We assume that memory access latency differs with the node hop distance,
> and that a longer hop distance means a longer memory access latency.
> This ratio will be close to 1 if most of the application's memory
> accesses are to local memory.

This is reasonable, but are you able to leverage any of the ACPI data
that's contained in the x86 BIOS?  I know Jonathan Chew did a bunch of
work to read the various latency tables.  Could you take advantage of
the work that he has already done to determine how long RMA to various
hops would take?
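
To make the quoted ratio concrete, here is a minimal sketch of the
arithmetic. It is not NUMAtop code: the function name and the nanosecond
values are hypothetical, and how the local latency gets calibrated (by
measurement or from firmware tables) is left open.

```python
def llc_latency_ratio(actual_latency_ns, calibrated_local_latency_ns):
    """Observed memory access latency divided by the calibrated
    local memory access latency (both in the same unit)."""
    if calibrated_local_latency_ns <= 0:
        raise ValueError("calibrated local latency must be positive")
    return actual_latency_ns / calibrated_local_latency_ns


# Hypothetical sample values, chosen only to show the two cases.
mostly_local = llc_latency_ratio(105.0, 100.0)   # close to 1
mostly_remote = llc_latency_ratio(180.0, 100.0)  # well above 1
```

Because the result is relative to the platform's own local latency, the
same threshold can be applied on machines with different absolute
latencies.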

> In conclusion, we propose the following metrics for NUMAtop:
> 1) sysload - CPU sensitive
> 2) LLC Miss per Instruction - memory sensitive
> 3) LLC Latency ratio - memory locality
> 4) LMA and RMA accesses as a percentage of total memory accesses
> - 4.1) LMA/(total memory access)%
> - 4.2) RMA/(total memory access)%

This seems like an improvement.  Thanks. :)
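
For illustration, metrics 2 and 4 from the list above could be derived
from raw counts roughly as follows. This is a sketch under assumptions,
not NUMAtop code: the function and parameter names are hypothetical, and
"total memory access" is assumed here to be LMA + RMA, which the thread
does not spell out. sysload is omitted because its computation is not
described in this thread.

```python
def numa_metrics(llc_misses, instructions, lma, rma):
    """Derive per-instruction miss rate and local/remote access
    percentages from raw counts (all names are hypothetical)."""
    total_accesses = lma + rma  # assumption: total = local + remote
    if instructions <= 0 or total_accesses <= 0:
        raise ValueError("instruction and access counts must be positive")
    return {
        "llc_miss_per_instruction": llc_misses / instructions,
        "lma_percent": 100.0 * lma / total_accesses,
        "rma_percent": 100.0 * rma / total_accesses,
    }


# Hypothetical counts, not measurements from any real system.
sample = numa_metrics(llc_misses=2000, instructions=1000000,
                      lma=1500, rma=500)
```

With these made-up counts, three quarters of the accesses are local, so
the LMA and RMA percentages come out at 75 and 25.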

> BTW: Do we still need one more +1 vote for NUMAtop project? 

You have one from me and one from esaxe. I think you still need one
more.

-j
_______________________________________________
perf-discuss mailing list
perf-discuss@opensolaris.org
