Jonathan Chew wrote:
>
> Thanks for summarizing the metrics. However, I wanted to see a summary
> of the overall NUMAtop proposal given the feedback that you have gotten,
> so I can understand what the project is proposing to do now that you
> have gotten feedback. Then I can decide whether I have anything to add
> and whether I want to approve it as is or not.
>
> From the email thread so far, it looks as though Krish gave a very
> brief description of the project, Jin Yao explained some phases for the
> project, and you have listed some proposed metrics for the tool
>
> Have anything of these changed given the feedback that you have gotten?
> Can you please summarize your latest project proposal including the
> description, phases, metrics, and anything else that is useful for
> understanding what the project is proposing to do?
>
>
> Jonathan

NUMAtop focuses on NUMA-related characteristics. It is a tool to help
developers identify memory locality problems on NUMA systems. The tool is
top-like: it shows the top N processes in the system along with their
memory locality, with the processes that have the worst memory locality at
the top of the list. It can also attach to a process and show the memory
locality of its threads in the same top style.

The information NUMAtop reports is collected from memory-related hardware
counters and the libcpc DTrace provider. Some of these counters are already
supported in kcpc and libcpc, while others are not. Intel Nehalem-based and
next-generation platforms provide a memory load latency event, which is
central to NUMAtop's approach and requires a PEBS framework implementation
in Solaris.

The following proposed metrics will be one part of our phase I work.

Applications can be classified as CPU-sensitive, memory-sensitive, or
IO-sensitive. An IO-sensitive application can be identified by low CPU
utilization. A memory-sensitive application looks like a CPU-sensitive
one, with high CPU utilization, so it takes a second metric to tell the
two apart. So we have the following metrics:

1) sysload - CPU-sensitive
2) LLC Miss per Instruction - memory-sensitive

Once we have determined that the application is memory-sensitive, we'll
check the memory locality metrics to find the cause of the performance
regression:

3) LLC Latency Ratio (Average Latency for LLC Miss / Local Memory Access
   Latency)
4) Source distribution for LLC misses:
   4.1) LMA / (Total LLC Miss Retired) %
   4.2) RMA / (Total LLC Miss Retired) %

Here LMA and RMA are local and remote memory accesses. 4.2) could be
broken down further into separate percentages for each NUMA node hop
distance.

NUMAtop should produce a useful report showing how effectively the
application is using local memory. We need the PEBS framework to implement
NUMAtop's metrics, and we need sponsors from MPO and from the libcpc DTrace
provider to help figure out where memory use is not effective and why.
Suggesting a better memory placement strategy is also a valuable goal of
NUMAtop.
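
To make the metrics above concrete, here is a rough sketch of how they
could be computed from one per-process counter sample. This is only an
illustration, not NUMAtop code: the Sample fields, the function names and
the two thresholds are assumptions made up for the example, and real values
would come from the hardware counters via libcpc/kcpc and PEBS.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class Sample:
    """Hypothetical per-process counter sample (all field names are assumptions)."""
    cpu_util: float              # fraction of CPU time used, 0.0 to 1.0 (stands in for sysload)
    instructions: int            # instructions retired
    llc_misses: int              # total LLC misses retired
    lma: int                     # LLC misses served from local memory
    rma_by_hop: Dict[int, int]   # LLC misses served from remote memory, keyed by hop count
    avg_miss_latency: float      # average latency of an LLC miss, in cycles
    local_latency: float         # local memory access latency, in cycles


def llc_miss_per_instruction(s: Sample) -> float:
    """Metric 2: LLC misses per retired instruction."""
    return s.llc_misses / s.instructions if s.instructions else 0.0


def llc_latency_ratio(s: Sample) -> float:
    """Metric 3: average LLC miss latency relative to local memory latency."""
    return s.avg_miss_latency / s.local_latency if s.local_latency else 0.0


def source_distribution(s: Sample) -> Dict[str, float]:
    """Metric 4: percent of LLC misses served locally (4.1) and per remote hop (4.2)."""
    if s.llc_misses == 0:
        return {"local": 0.0}
    dist = {"local": 100.0 * s.lma / s.llc_misses}
    for hop, count in sorted(s.rma_by_hop.items()):
        dist[f"remote_hop_{hop}"] = 100.0 * count / s.llc_misses
    return dist


def classify(s: Sample, cpu_threshold: float = 0.5, mpi_threshold: float = 0.01) -> str:
    """Classify a process; both thresholds are placeholders, not proposal values."""
    if s.cpu_util < cpu_threshold:
        return "io-sensitive"        # low CPU utilization
    if llc_miss_per_instruction(s) >= mpi_threshold:
        return "memory-sensitive"    # busy CPU and many LLC misses per instruction
    return "cpu-sensitive"


if __name__ == "__main__":
    sample = Sample(cpu_util=0.9, instructions=1_000_000, llc_misses=20_000,
                    lma=5_000, rma_by_hop={1: 10_000, 2: 5_000},
                    avg_miss_latency=300.0, local_latency=100.0)
    print(classify(sample))
    print(llc_latency_ratio(sample))
    print(source_distribution(sample))
```

A latency ratio well above 1 together with a low local percentage would
point at remote accesses as the likely cause, which is the case the
per-hop breakdown of 4.2) is meant to narrow down.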

Thanks,
-Aubrey