Re: [perf-discuss] NUMAtop for OpenSolaris

Li, Aubrey Wed, 20 Jan 2010 00:42:56 -0800

Jonathan Chew wrote:
>
>Thanks for summarizing the metrics.  However, I wanted to see a summary
>of the overall NUMAtop proposal given the feedback that you have gotten,
>so I can understand what the project is proposing to do now that you
>have gotten feedback.  Then I can decide whether I have anything to add
>and whether I want to approve it as is or not.
>
>From the email thread so far, it looks as though Krish gave a very
>brief description of the project, Jin Yao explained some phases for the
>project, and you have listed some proposed metrics for the tool.
>
>Has any of this changed given the feedback that you have gotten?
>Can you please summarize your latest project proposal including the
>description, phases, metrics, and anything else that is useful for
>understanding what the project is proposing to do?
>
>
>Jonathan


NUMAtop focuses on NUMA-related characteristics. It is a tool to help
developers identify memory locality problems on NUMA systems. The tool is
top-like: it shows the top N processes in the system and their memory
locality, with the processes that have the worst memory locality at the top
of the list. It can also attach to a process and show the memory locality of
its threads in the same top style.
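
As a rough sketch of the top-style ordering described above (not the actual
NUMAtop implementation), the following ranks processes by the fraction of
their memory accesses that are remote. The ProcSample type, its field names,
and the choice of "remote fraction" as the locality score are assumptions
made only for illustration.

```python
from dataclasses import dataclass


@dataclass
class ProcSample:
    """Hypothetical per-process sample; field names are illustrative."""
    pid: int
    name: str
    local_accesses: int   # memory accesses served by the local node
    remote_accesses: int  # memory accesses served by a remote node


def remote_ratio(p: ProcSample) -> float:
    """Fraction of accesses that were remote (0.0 if nothing was sampled)."""
    total = p.local_accesses + p.remote_accesses
    return p.remote_accesses / total if total else 0.0


def top_n(samples, n):
    """Return the n samples with the worst locality, worst first."""
    return sorted(samples, key=remote_ratio, reverse=True)[:n]


if __name__ == "__main__":
    demo = [
        ProcSample(101, "db", 900, 100),
        ProcSample(102, "web", 100, 900),
        ProcSample(103, "batch", 500, 500),
    ]
    for p in top_n(demo, 2):
        print(p.pid, p.name, f"{remote_ratio(p):.2f}")
```

The same ordering would apply to the threads of a single process once the
tool is attached to it.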

The information NUMAtop reports is collected from memory-related hardware
counters and the libcpc DTrace provider. Some of these counters are already
supported in kcpc and libcpc, while some are not. Intel Nehalem-based and
next-generation platforms provide a memory load latency event, which is an
important data source for NUMAtop and requires a PEBS framework
implementation on Solaris.

The following proposed metrics will be one part of our phase I work.
Applications can be classified as CPU-sensitive, memory-sensitive, or
IO-sensitive. An IO-sensitive application can be identified by low CPU
utilization. A memory-sensitive application should also appear as a
CPU-sensitive application, with high CPU utilization.

So we have the following metrics:

1) sysload                  - CPU-sensitive
2) LLC Miss per Instruction - memory-sensitive
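
To make the classification concrete, here is a minimal sketch of how the two
metrics could be combined. The threshold values, the use of a 0..1 CPU
utilization figure as a stand-in for sysload, and the function names are all
assumptions for illustration; the proposal does not fix any of them.

```python
def llc_miss_per_instruction(llc_misses: int, instructions: int) -> float:
    """LLC misses divided by retired instructions (0.0 if no instructions)."""
    return llc_misses / instructions if instructions else 0.0


def classify(cpu_util: float, llc_misses: int, instructions: int,
             cpu_threshold: float = 0.5,    # hypothetical cut-off
             mpi_threshold: float = 0.01):  # hypothetical cut-off
    """Classify an application following the rules in the text above.

    Low CPU utilization -> IO-sensitive.
    High CPU utilization plus many LLC misses per instruction
    -> memory-sensitive; otherwise CPU-sensitive.
    """
    if cpu_util < cpu_threshold:
        return "io-sensitive"
    if llc_miss_per_instruction(llc_misses, instructions) >= mpi_threshold:
        return "memory-sensitive"
    return "cpu-sensitive"


if __name__ == "__main__":
    print(classify(0.10, 1_000, 1_000_000))
    print(classify(0.90, 50_000, 1_000_000))
    print(classify(0.90, 100, 1_000_000))
```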

Once we have determined that the application is memory-sensitive, we'll check
the memory locality metrics to find the cause of the performance regression.

3) LLC Latency Ratio
   (Average Latency for LLC Miss / Local Memory Access Latency)
4) Source distribution for LLC misses:
  -4.1) LMA / (Total LLC Miss Retired) %
  -4.2) RMA / (Total LLC Miss Retired) %

Here, 4.2) could be further split into separate percentages by NUMA node hop
distance.
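
The arithmetic behind metrics 3) and 4) can be sketched as follows. This
assumes LMA and RMA are counts of local and remote memory accesses, and that
remote accesses can be bucketed by hop count; the function names and the
dictionary layout are hypothetical, not part of the proposal.

```python
def llc_latency_ratio(avg_llc_miss_latency: float,
                      local_access_latency: float) -> float:
    """Metric 3: average LLC miss latency over local memory access latency."""
    if local_access_latency <= 0:
        raise ValueError("local_access_latency must be positive")
    return avg_llc_miss_latency / local_access_latency


def source_distribution(total_llc_miss: int, lma: int, rma_by_hop: dict):
    """Metric 4: where LLC misses were served from, as percentages.

    rma_by_hop maps a hop count (1, 2, ...) to the number of remote
    accesses served at that distance (an assumed input layout).
    """
    if total_llc_miss <= 0:
        return {"lma_pct": 0.0, "rma_pct": 0.0, "rma_pct_by_hop": {}}

    def pct(count):
        return 100.0 * count / total_llc_miss

    return {
        "lma_pct": pct(lma),                          # 4.1
        "rma_pct": pct(sum(rma_by_hop.values())),     # 4.2
        "rma_pct_by_hop": {h: pct(c) for h, c in sorted(rma_by_hop.items())},
    }


if __name__ == "__main__":
    print(llc_latency_ratio(300.0, 100.0))
    print(source_distribution(1000, 600, {1: 300, 2: 100}))
```

A latency ratio close to 1 would suggest that most misses are served from
local memory, while a larger ratio points toward remote accesses.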

NUMAtop should provide a useful report showing how effectively the application
is using local memory. We need the PEBS framework to implement the NUMAtop
metrics, and we need an MPO sponsor and a libcpc DTrace provider sponsor to
help figure out where memory use is not effective and why. Suggesting a better
memory placement strategy is also a valuable goal of NUMAtop.

Thanks,
-Aubrey
_______________________________________________
perf-discuss mailing list
perf-discuss@opensolaris.org
