[perf-discuss] A metric for overall system utilization

2011-05-31 Thread Kishore Kumar Pusukuri
Hi, Could you please suggest some metrics for measuring overall system utilization (e.g., on a NUMA machine running Solaris 10)? Is it okay to use an overall CPU utilization metric such as "usr" from the mpstat(1) output? Please let me know. Thanks. -- This message posted from opensolaris.org

[perf-discuss] Overall system utilization vs resource utilization by an application

2011-05-30 Thread Kishore Kumar Pusukuri
Hi, I would like to measure both system utilization and the progress of a multithreaded application on an 8-core machine running Solaris 10. I am using prstat(1) for measuring the progress of the application in terms of USR and planning to use mpstat(1) for measuring overall system utilizati

[perf-discuss] Dramatic Performance Degradation with Binding

2011-02-11 Thread Kishore Kumar Pusukuri
Hi, I am running a multithreaded application with 20 threads on my 24-core AMD Opteron (ccNUMA) machine running Solaris 10. When I run the application with its threads bound to cores using pbind (one thread per core), its performance degrades dramatically. It is around 80% performance lo

[perf-discuss] Running OpenMP program

2011-02-02 Thread Kishore Kumar Pusukuri
Hi, I have been compiling and running OpenMP parallel programs successfully so far on my OpenSolaris.2009.10 machine. However, I am unable to run one program with more than one thread (LWP). Please see the details of the program below. Please let me know if there is anything wrong with the compilation

[perf-discuss] system traps with FX scheduling policy

2011-01-28 Thread Kishore Kumar Pusukuri
Hi, I am playing with FX scheduling policy with different time-quanta on SPECOMP multithreaded programs. I am using "prstat -Lm" to analyze the effect of different time-quanta on the performance of the programs. Most of the programs experience "system traps" (TRP) with FX 10ms time-quantum. H

[perf-discuss] Microbenchmarks for synchronization mechanisms

2011-01-26 Thread Kishore Kumar Pusukuri
Hi, Could you help me to find/develop microbenchmarks for stressing "synchronization mechanisms" of OpenSolaris? Thank you.

[perf-discuss] Fixed-Priority Scheduling Policy vs Round-Robin Scheduling

2010-11-16 Thread Kishore Kumar Pusukuri
Does the Fixed-Priority scheduling policy function similarly to round-robin scheduling with a fixed time quantum (somewhat like SCHED_RR in Linux)? Please let me know. Thank you.

[perf-discuss] Using collect utility

2010-08-27 Thread Kishore Kumar Pusukuri
Hi, I am trying to use the "collect" utility to study TLB misses of a multi-threaded program running on my AMD multi-core machine equipped with OpenSolaris.2009.06. However, the program hangs (both with and without umask) when I use the collect utility. Please find the "prstat -m"

[perf-discuss] Using madvise(3C)

2010-08-27 Thread Kishore Kumar Pusukuri
Hi, I am trying to play with madvise on my AMD machine running OpenSolaris.2009.06. However, I get the following error when I compile the program below with /usr/sfw/bin/g++. Please help me to resolve this. I am not sure whether my usage of madvise is correct. Please let me kno

[perf-discuss] Using madvise(3C)

2010-08-27 Thread Kishore Kumar Pusukuri
Hi, I am trying to play with madvise on my AMD machine running OpenSolaris.2009.06. However, I get the following error when I compile the program below with /usr/sfw/bin/g++. Please help me to resolve this. 457: error: `madvise' undeclared (first use this function) 457: error: (Eac

[perf-discuss] Performance Degradation with 1GB Pages for Heap

2010-08-20 Thread Kishore Kumar Pusukuri
Hi, My AMD Opteron supports 4KB, 2MB and 1GB page sizes. I observed a performance improvement (reduced elapsed time) for some multi-threaded applications when I used a 2MB page size for the heap. These applications need around 650MB of heap (they read a huge file of around 650MB). However

[perf-discuss] Measuring cost of TLB misses

2010-08-20 Thread Kishore Kumar Pusukuri
Hi, I am able to measure the TLB miss rate of a multi-threaded application running on my multi-core AMD Opteron machine by reading performance monitoring event counters using the cpustat utility. However, I would also like to measure the amount of time spent on TLB misses. Specifically, I am looking for a way li

[perf-discuss] 64-bit vs 32-bit applications

2010-08-16 Thread Kishore Kumar Pusukuri
Hi, I am surprised by the performance of some 64-bit multi-threaded applications on my AMD Opteron machine. For most of the applications, the performance of the 32-bit version is almost the same as that of the 64-bit version. However, for a couple of applications, the 32-bit versions provide bette

[perf-discuss] Mapping the kernel heap with large pages

2010-08-13 Thread Kishore Kumar Pusukuri
Hi, One of my applications is spending around 90% of its total execution time reading a huge file using the read system call. I thought that I could improve the performance of the application by increasing the page size for the kernel heap. I know that I can increase the page size of the application heap using ppgs

[perf-discuss] Change segvn cache size

2010-08-13 Thread Kishore Kumar Pusukuri
Hi, I observed that one multi-threaded application is generating a large number of cross-calls (xcalls) on my AMD multi-core machine. A snapshot of the stack trace is shown below. I think that this is because of "segvn" activity, i.e. unmapping the page and generating cross-call activity to maintain MMU leve

[perf-discuss] libmtmalloc vs libumem

2010-08-13 Thread Kishore Kumar Pusukuri
Hi, I am able to understand how libmtmalloc works from the documentation in the libmtmalloc.c source file. However, I am unable to find proper documentation for libumem. Could someone explain the key differences between libmtmalloc and libumem, please? Please also provide me links to the documentatio

[perf-discuss] Segmentation fault when using libhoard memory allocat and getting core dump

2010-08-10 Thread Kishore Kumar Pusukuri
I am trying to see the impact of different memory allocators on multi-threaded workloads on my AMD machine running OpenSolaris 2009.06. I successfully used libmtmalloc and libumem; however, the program dumps core (with a segmentation fault) when I use libhoard_32.so. However, I didn't get any errors w

[perf-discuss] Using multiple page sizes

2010-08-09 Thread Kishore Kumar Pusukuri
I would like to see the impact of different page sizes on the performance of multi-threaded applications. However, the pagesize -a command reports only 3 possible page sizes, including the default 4KB, on my AMD machine (shown below). Are these the only page sizes I can use, or is there any way