Re: [perf-discuss] On the uses of semaphores sets

2010-09-16 Thread johansen
On Thu, Sep 16, 2010 at 01:17:27PM -0500, Jason wrote: > On Thu, Sep 16, 2010 at 12:46 PM, Louis Munro wrote: > > I also understand that when a set is full a request for an additional > > semaphore will be allocated from another set if there is one available. > > Again, is that right? > > Not sur

Re: [perf-discuss] [on-discuss] Syscall for posix_spawn?

2010-07-14 Thread johansen
On Wed, Jul 14, 2010 at 06:29:14PM -0500, Nicolas Williams wrote: > On Thu, Jul 15, 2010 at 01:27:06AM +0200, ? wrote: > > A fairly simple - but not perfectly accurate - test case would be to > > run ksh93 on Linux and Solaris and let it call /bin/true in a loop. > > > > The numbe
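The loop test quoted above can be approximated without a shell. The following is a minimal sketch, assuming Python 3.8 or later (for os.posix_spawn) and a `true` binary on the PATH. The helper name spawn_loop and the iteration count are illustrative and do not come from the thread. It only measures wall-clock time per spawn; it says nothing about where the time is spent.

```python
import os
import shutil
import time


def spawn_loop(program, iterations):
    """Spawn `program` `iterations` times via posix_spawn.

    Returns (elapsed_seconds, failures), where a failure is any child
    that did not exit normally with status 0.
    """
    failures = 0
    start = time.perf_counter()
    for _ in range(iterations):
        pid = os.posix_spawn(program, [program], os.environ)
        _, status = os.waitpid(pid, 0)
        if not (os.WIFEXITED(status) and os.WEXITSTATUS(status) == 0):
            failures += 1
    return time.perf_counter() - start, failures


# Assumption: `true` is installed; fall back to the path named in the thread.
TRUE_PATH = shutil.which("true") or "/bin/true"

elapsed, failures = spawn_loop(TRUE_PATH, 100)
print(f"100 spawns of {TRUE_PATH}: {elapsed:.3f}s, {failures} failures")
```

Running the same script on two systems gives a first-order comparison, which is the kind of basic data the thread asks for before anyone proposes a new system call.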

Re: [perf-discuss] [on-discuss] Syscall for posix_spawn?

2010-07-14 Thread johansen
On Wed, Jul 14, 2010 at 02:37:34PM -0400, James Carlson wrote: > I think having the original poster come up with a test case of interest > (something that runs significantly faster on "all other platforms") and > then gathering basic data on that test case (where does a profiler say > the system is

Re: [perf-discuss] [on-discuss] Syscall for posix_spawn?

2010-07-14 Thread johansen
On Wed, Jul 14, 2010 at 12:32:26AM -0500, Nicolas Williams wrote: > One possible improvement that comes to mind would be a private vforkx() > flag to request that threads in the parent other than the one calling > vforkx() not be stopped. The child would have to be extremely careful > to not call

Re: [perf-discuss] [on-discuss] Syscall for posix_spawn?

2010-07-14 Thread johansen
On Wed, Jul 14, 2010 at 01:12:37PM -0500, Nicolas Williams wrote: > On Wed, Jul 14, 2010 at 11:06:38AM -0700, johan...@opensolaris.org wrote: > > On Tue, Jul 13, 2010 at 06:41:32PM -0500, Nicolas Williams wrote: > > > As to your question, posix_spawn(2) wouldn't fork at all under the > > > covers:

Re: [perf-discuss] [on-discuss] Syscall for posix_spawn?

2010-07-14 Thread johansen
On Wed, Jul 14, 2010 at 05:09:22AM +0200, ? wrote: > Reducing the *number* of system calls is not the solution. The problem > is that vfork and exec don't know that they are called in a sequence. Please see the manpage for vfork(2). They most certainly do know that they're called

Re: [perf-discuss] Kernel overhead and idle time in SMP virtual guest

2010-06-24 Thread johansen
On Thu, Jun 24, 2010 at 08:25:10AM +0200, Andrej Podzimek wrote: > Yes, the amount of idle and system time is *incredible*. Something > must be wrong. More than 20 running processes are reported by 'top' > most of the time, but at most 4 to 6 are on CPU at any given moment. What kind of workload a

Re: [perf-discuss] clarification regarding the need to call port_dissociate()

2010-05-18 Thread johansen
Hi Nils, On Tue, May 18, 2010 at 04:00:21PM +0200, Nils Goroll wrote: > I'm analyzing a core dump where port_associate() failed with EAGAIN. My > understanding is that the number of associated fds has reached the > process.max-port-events limit. You can get EAGAIN in a couple of different ways.

Re: [perf-discuss] pid$target: Not enough space

2010-05-04 Thread johansen
On Tue, May 04, 2010 at 03:23:09PM -0700, Jianhua Yang wrote: > I wanted to find out what the single threaded process was doing with dtrace > but it returned with "Not enough space"error: > > # dtrace -ln 'pid$target:::entry,pid$target:::return {trace(timestamp);}' -p > 25050 > dtrace: invalid p

Re: [perf-discuss] Changing the default buffer sizes for pipes ?

2010-03-29 Thread johansen
On Tue, Mar 30, 2010 at 02:47:14AM +0200, Chris Pickett wrote: > Re port to Solaris: No, I will not port this to Solaris. I've been > told that making ulimit's -p a tunable will not pass ARC review and I > am not going to write patches just to throw them away because ARC > doesn't like it. That sa

Re: [perf-discuss] NUMAtop for OpenSolaris

2010-01-12 Thread johansen
On Wed, Jan 13, 2010 at 09:17:47AM +0800, Li, Aubrey wrote: > johansen wrote: > > > >On Tue, Jan 12, 2010 at 02:20:02PM +0800, zhihui Chen wrote: > >> Applications can be categorized into CPU-sensitive, Memory-sensitive, > >> IO-sensitive. > > > >My conce

Re: [perf-discuss] NUMAtop for OpenSolaris

2010-01-12 Thread johansen
On Tue, Jan 12, 2010 at 02:20:02PM +0800, zhihui Chen wrote: > Applications can be categorized into CPU-sensitive, Memory-sensitive, > IO-sensitive. My concern here is that unless the customer knows how to determine whether his application is CPU, memory, or IO sensitive it's going to be hard to use

Re: [perf-discuss] NUMAtop for OpenSolaris

2010-01-05 Thread johansen
On Tue, Jan 05, 2010 at 04:27:03PM +0800, Li, Aubrey wrote: > >I'm concerned that unless we're able to demonstrate some causal > >relationship between RMA and reduced performance, it will be hard for > >customers to use the tools to diagnose problems. Imagine a situation > >where the application i

Re: [perf-discuss] NUMAtop for OpenSolaris

2010-01-04 Thread johansen
On Thu, Dec 17, 2009 at 05:07:45PM -0800, Jin Yao wrote: > We decide to divide numatop development into 2 phases. > > At phase 1, numatop is designed as a memory locality characterizing tool. > It provides an easy and friendly way for observing performance data from > hardware performance counte

Re: [perf-discuss] NUMAtop for OpenSolaris

2009-12-17 Thread johansen
On Wed, Dec 16, 2009 at 09:17:43PM -0800, Krishnendu Sadhukhan wrote: > I'd like to request sponsorship from the performance community to host a > project for NUMAtop for OpenSolaris. > > NUMAtop is a tool developed by Intel to help developers identify > memory locality in NUMA systems. It's a top

Re: [perf-discuss] sluggish opensolaris-b127 with unknown fault

2009-11-18 Thread johansen
Paul: I'm interested in getting a box like this. I hope this problem has a simple fix. > When I create an 8x mirrored vdev pool (1T samsung enterprise drives) and > do a simple dd test it maxes out at 100M/s, where I'd normally expect > 500M/s+ at least. Will you include the output from the follow

Re: [perf-discuss] madvise() and "heap" memory

2009-07-09 Thread johansen
On Thu, Jul 09, 2009 at 12:18:17PM -0500, Bob Friesenhahn wrote: > Do madvise() options like MADV_ACCESS_LWP and MADV_ACCESS_MANY work on > memory allocated via malloc()? If MADV_ACCESS_LWP is specified and > malloc() hands out heap memory which has been used before (e.g. by some > other LWP)

Re: [perf-discuss] perf-discuss Digest, Vol 48, Issue 17

2009-06-30 Thread johansen
David Collier-Brown wrote: > Actually Sean's asking for a clue about the *virtual* memory behavior. The pagelist and the cachelist contain page objects, which have an associated physical page. He's actually talking about physical memory, since you'll never have more pages on any of these lists th

Re: [perf-discuss] Can we get the free memory number right again? (w/ ZFS)

2009-06-25 Thread johansen
On Thu, Jun 25, 2009 at 04:55:14PM -0400, David Collier-Brown wrote: > johan...@sun.com replied > > I disagree with the premise that the free memory calculation isn't > > useful because it doesn't count ZFS caches as free space. > [big snip] > > The space that's consumed by ZFS is in use and can't

Re: [perf-discuss] Can we get the free memory number right again? (w/ ZFS)

2009-06-25 Thread johansen
On Thu, Jun 25, 2009 at 12:18:04PM -0700, Sean Liu wrote: > Correct me if I am wrong. If the applications request memory from O/S, > the kernel will cut back ZFS cache - so in this sense the cache is > free, right? Nope. It doesn't work this way. ZFS has a thread that runs periodically, and is i

Re: [perf-discuss] Can we get the free memory number right again? (w/ ZFS)

2009-06-25 Thread johansen
On Thu, Jun 25, 2009 at 11:16:50AM -0700, Sean Liu wrote: > Now that ZFS cache will eat up all the free mem for cache, again > that's all fine. But vmstat doesn't report meaningful free memory > again. Yes memstat can tell you the ZFS file data size but who wants > to run mdb every now and then? >

Re: [perf-discuss] V890 with US-IV+ : which core should interrupts be delivered to?

2009-06-19 Thread johansen
On Fri, Jun 19, 2009 at 05:52:02PM +0200, Nils Goroll wrote: > I am trying to reduce context switches on a V890 with 16 cores by > delegating one core to interrupt handling (by setting all others to > nointr using psradm). The best way to do this is to create processor sets, with the set that ha

Re: [perf-discuss] question about creation of a new segment (virtual memory)

2009-05-19 Thread johansen
On Mon, May 18, 2009 at 08:01:02PM +0800, Qihua Wu wrote: > My question is: > 1: is the time searching for a free space of 10M counted as sys cpu or user > cpu? The time that the VM spends looking for free space would be considered SYS. > 2: What's the algorithm solaris used to find the free spac

Re: [perf-discuss] -lumem slower than -lmalloc and -lmtmalloc?

2009-05-12 Thread johansen
On Wed, May 13, 2009 at 03:28:11AM +0200, Roland Mainz wrote: > Bob Friesenhahn wrote: > > On Tue, 12 May 2009, Roland Mainz wrote: > > > Can you check whether the memory allocator in libast performs better in > > > this case (e.g. compile with $ cc -I/usr/include/ast/ -last ... # (note: > > > liba

Re: [perf-discuss] -lumem slower than -lmalloc and -lmtmalloc?

2009-05-12 Thread johansen
On Wed, May 13, 2009 at 03:28:19AM +0200, Roland Mainz wrote: > johan...@sun.com wrote: > > On Tue, May 12, 2009 at 01:33:15AM +0200, Roland Mainz wrote: > > > Can you check whether the memory allocator in libast performs better in > > > this case (e.g. compile with $ cc -I/usr/include/ast/ -last .

Re: [perf-discuss] Swap usage and reporting

2009-05-12 Thread johansen
Hi Ethan, On Tue, May 12, 2009 at 03:25:52PM -0700, Ethan Erchinger wrote: > Results, everything looks reasonable to me. > http://pastebin.com/m785c155e Thanks for running this. MySQL is using 44g of memory, which is pretty close to the 48gb limit for the machine. I'm sort of curious about how

Re: [perf-discuss] Swap usage and reporting

2009-05-12 Thread johansen
On Tue, May 12, 2009 at 09:41:41AM -0700, Ethan Erchinger wrote: > Hi all, > > I'm having trouble determining what is using a large amount of swap on a > few of our OpenSolaris systems. These systems run MySQL, the 5.0.65 > version that came with snv_101, have 48G of ram, and 24G of swap. The >

Re: [perf-discuss] -lumem slower than -lmalloc and -lmtmalloc?

2009-05-11 Thread johansen
On Mon, May 11, 2009 at 08:35:40PM -0500, Bob Friesenhahn wrote: >>> Yes. I don't know what libjpeg itself does, but GraphicsMagick should >>> be performing a similar number of allocations (maybe 1000 small >>> allocations) regardless of the size of the JPEG file. >> >> There are some known issues

Re: [perf-discuss] -lumem slower than -lmalloc and -lmtmalloc?

2009-05-11 Thread johansen
On Mon, May 11, 2009 at 06:42:55PM -0500, Bob Friesenhahn wrote: > On Mon, 11 May 2009, johan...@sun.com wrote: >> >> I'm not entirely convinced that this is simply the difference between >> memory mapped allocations versus sbrk allocations. If you compare the >> numbers between malloc and umem, n

Re: [perf-discuss] -lumem slower than -lmalloc and -lmtmalloc?

2009-05-11 Thread johansen
On Tue, May 12, 2009 at 01:33:15AM +0200, Roland Mainz wrote: > Can you check whether the memory allocator in libast performs better in > this case (e.g. compile with $ cc -I/usr/include/ast/ -last ... # (note: > libast uses a |_ast_|-prefix for all symbols and does (currently) not > act as |malloc

Re: [perf-discuss] -lumem slower than -lmalloc and -lmtmalloc?

2009-05-11 Thread johansen
On Mon, May 11, 2009 at 11:10:37AM -0500, Bob Friesenhahn wrote: > It seems that the performance issue stems from libumem using memory > mapped allocations rather than sbrk allocations. I have not seen a > performance impact from using libumem in any other part of the software. > The perform

Re: [perf-discuss] Project Proposal: x86 APIC Scalability

2009-05-06 Thread johansen
Michelle, > Description > > x86 APIC scalability project will modify low-level > Solaris interrupt handling mechanism to support > 256 * #CPUs on a multi-processor system. The project > will also support multiple interrupt priority level on > the same IRQ

Re: [perf-discuss] zfs filebench: cached workload versus uncached workload

2009-04-28 Thread johansen
On Tue, Apr 28, 2009 at 12:58:25PM -0700, Andrew Wilson wrote: > I believe that Solaris always converts reads and writes into file mapped > I/O, so if the cached attribute is missing or set to "false" the VM pages > involved in that will be flushed, so you will see a small difference in > perfor

Re: [perf-discuss] ZFS SMI vs EFI performance using filebench

2009-04-23 Thread johansen
On Thu, Apr 23, 2009 at 04:29:39PM -0700, Dearl D. Neal wrote: > I appreciate the response.. it's the best I have received so far. > From further testing today, I am thoroughly confused. This morning I > was seeing par results with ufs and zfs for the fileserver workload. > The db workloads were abo

Re: [perf-discuss] ZFS SMI vs EFI performance using filebench

2009-04-23 Thread johansen
On Thu, Apr 23, 2009 at 06:07:06AM -0700, Dearl Neal wrote: > I have been testing the performance of zfs vs. ufs using filebench. > The setup is a v240, 4GB RAM, 2...@1503mhz, 1 320GB _SAN_ attached LUN, > and using a ZFS mirrored root disk. Our SAN is a top notch NVRAM > based SAN. Can you give

Re: [perf-discuss] ZFS performance issue - READ is slow as hell...

2009-04-13 Thread johansen
On Sun, Apr 12, 2009 at 11:23:37AM -0400, Jim Mauro wrote: > Thank you Roland. I will try and get an nv110 build in-house > and reproduce this. Your dd test after reboot is a single threaded > sequential read, so I still don't get how disabling prefetch yields > a 15X bandwidth increase. > > I appr

Re: [perf-discuss] Calculating segmap cache hits and misses

2009-03-12 Thread johansen
> Well, 4270983830 = 0xFE920A96 -- curiously close to the 32-bit > wraparound at 0x, eh? It looks like its cousin's value has > already wrapped. These sure look like 32-bit unsigned counters! These counters are defined as KSTAT_DATA_ULONG, which means that they're only 32-bits on a 32-b
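If the counters really are 32 bits wide, a sampling script can still compute correct deltas across a wrap by doing the subtraction modulo 2^32. The sketch below is illustrative only: counter_delta is a hypothetical helper, the second sample value is invented, and the result is valid only when the counter wrapped at most once between the two samples.

```python
def counter_delta(prev, cur, bits=32):
    """Return cur - prev for a free-running unsigned counter `bits` wide.

    Correct as long as the counter wrapped at most once between samples.
    """
    return (cur - prev) % (1 << bits)


prev_sample = 4270983830  # 0xFE920A96, the value quoted in the thread
cur_sample = 100          # hypothetical later sample, taken after a wrap

# A plain subtraction goes negative once the counter wraps ...
print("naive delta:  ", cur_sample - prev_sample)
# ... while the modular subtraction recovers the number of events.
print("modular delta:", counter_delta(prev_sample, cur_sample))
```

A negative "misses" figure in a script that subtracts successive samples is consistent with this kind of wrap, although other causes are possible.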

Re: [perf-discuss] Calculating segmap cache hits and misses

2009-03-10 Thread johansen
> I'm monitoring some very busy NFS clients and would like to graph the > segmap cache hits and misses. I took a look inside the code of the > segmapstat script included in Mr. Gregg's CacheKit collection, and I'm > getting some negative misses values in the results. I'm not sure I understan

Re: [perf-discuss] The rm_assize() problem

2009-02-26 Thread johansen
Hi Chad, > BTW, I've posted an updated webrev here: > http://cr.opensolaris.org/~cmynhier/6801244.1/. I wanted to do a > nightly build and some testing before posting it. This looks better. Thanks for making the changes we discussed before. > > The only options I see for doing this are either

Re: [perf-discuss] libmicro with Sun Studio 12

2009-02-24 Thread johansen
> Hi, does anybody care about libmicro any more? It needs some help > to work with Studio 12. > > make extra_CFLAGS=-features=no%conststrings > > The libmicro site said that this was where to discuss it, but I don't > see any discussion about it, so if there is somewhere else, please > let me kn

Re: [perf-discuss] The rm_assize() problem

2009-02-24 Thread johansen
Hi Chad, You probably want a vm expert to take a look at this code, but I'm happy to provide comments anyway. vm/as.h: - line 114: You probably want this member at the end of the structure. If we end up backporting the fix, we don't want to displace the pre-existing offsets in a struct

Re: [perf-discuss] Expiring Core Contributor Grants

2009-02-12 Thread johansen
Folks, The voting for core contributor grants has been open for a week. At this point we've received enough votes to proceed. Here are the renewals:
  akolb  Core Contributor  +1: esaxe, jjc, johansen, mpogue
  barts  Core Contributor  +1: esaxe, jjc, johansen, m

Re: [perf-discuss] Code review of FileBench shutdown bug fixes requested

2009-02-10 Thread johansen
Hi Drew, This fix looks fine to me. Just so I understand your shutdown model, you send SIGUSR1 to the processes in the procflow after you've handled the SIGINT/SIGTERM from kill/ ^C, right? Thanks, -j On Tue, Feb 10, 2009 at 10:20:25AM -0800, Andrew Wilson wrote: > Dear OpenSolaris performance
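The shutdown model asked about above (the parent handles SIGINT/SIGTERM, then sends SIGUSR1 to its worker processes) can be sketched as follows. This is an assumption-laden illustration of that general pattern, not FileBench's implementation: start_workers, shut_down and install_handlers are hypothetical names, and the workers do nothing except wait for the signal.

```python
import os
import signal


def start_workers(count):
    """Fork `count` workers that exit cleanly when they receive SIGUSR1."""
    # Block SIGUSR1 before forking so that a signal sent early stays pending
    # in the child instead of killing it before it is ready.
    old_mask = signal.pthread_sigmask(signal.SIG_BLOCK, {signal.SIGUSR1})
    pids = []
    try:
        for _ in range(count):
            pid = os.fork()
            if pid == 0:
                signal.sigwait({signal.SIGUSR1})  # worker: wait for shutdown
                os._exit(0)
            pids.append(pid)
    finally:
        signal.pthread_sigmask(signal.SIG_SETMASK, old_mask)
    return pids


def shut_down(pids):
    """Send SIGUSR1 to every worker, reap it, and return the exit codes."""
    for pid in pids:
        os.kill(pid, signal.SIGUSR1)
    codes = []
    for pid in pids:
        _, status = os.waitpid(pid, 0)
        # Negative value means the worker died from a signal instead.
        codes.append(os.WEXITSTATUS(status) if os.WIFEXITED(status)
                     else -os.WTERMSIG(status))
    return codes


def install_handlers(pids):
    """On SIGINT/SIGTERM in the parent, shut the workers down, then exit."""
    def on_signal(signum, frame):
        shut_down(pids)
        raise SystemExit(0)
    signal.signal(signal.SIGINT, on_signal)
    signal.signal(signal.SIGTERM, on_signal)


workers = start_workers(2)
print("worker exit codes:", shut_down(workers))
```

Blocking SIGUSR1 across the fork is a design choice made here to avoid a startup race; whether the real code needs or does something similar is not stated in the thread.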

Re: [perf-discuss] Contributer grant nomination for Chad Mynheir

2009-02-06 Thread johansen
I'm also in favor of such a nomination (+1) -j On Fri, Feb 06, 2009 at 10:23:40AM -0500, Jim Mauro wrote: > A big +1 for Chad > > Eric Saxe wrote: > > I'd like to put forth a contributer grant nomination for Chad Mynhier, > > for his recent contributions in the area of performance observability

[perf-discuss] Expiring Core Contributor Grants

2009-02-05 Thread johansen
ontributor  2009-02-24
esaxe     Eric C. Saxe      Core Contributor  2009-02-24
johansen  johansen          Core Contributor  2009-02-24
jjc       Jonathan J. Chew  Core Contributor  2009-02-24
mpogue    Mike Pogue        Core C

Re: [perf-discuss] x86 memory cache control in OpenSolaris?

2009-02-04 Thread johansen
The PSARC case you've found is for the DDI. See ddi_dma_mem_alloc(9F) for details about how a device may describe the cache attributes for allocated memory. As far as I know, there isn't a similar interface for application memory. The memcntl(2) interface allows you to set permissions on memory

Re: [perf-discuss] Performance of 32-bit vs 64-bit benchmark

2009-01-30 Thread johansen
> > I'm not sure that I follow your argument. The T1000's architecture > > favors workloads that have many parallel tasks that involve data > > throughput. The Xeon is going to have a better showing for straight > > number-crunching work. If your webserver benchmark is trying to measure > > the

Re: [perf-discuss] Performance of 32-bit vs 64-bit benchmark

2009-01-30 Thread johansen
On Fri, Jan 30, 2009 at 04:14:50PM -0500, Elad Lahav wrote: > > Ahh.. is there something specific that leads you to believe it's > > related to integer performance? > > Definitely not. I just wanted to get some basic notion of the relative > "power" of the machine in a benchmark that was supposed t

Re: [perf-discuss] FileBench Code review requested

2009-01-20 Thread johansen
Hi Drew, > The goal is to make it easy for other file system clients, such as > nfsv3, to be added to filebench. The will supply their own routines to > implement the I/O functions, as well as custom asynchronous flowops. > > See it all at: > > http://cr.opensolaris.org/~dreww/localfs_plugin/

Re: [perf-discuss] how to find out kernel memory allocated by what threads?

2009-01-06 Thread johansen
> But I wonder can this be the problem? (claiming 55% of memory in > kmem_alloc_xxx buffers, which totals at about 16GB! ) Didn't those > "simple" exec (ps, perl, grep, prtdiag etc.) free the buffers/memory > when they exit? Many of these allocations probably were freed. Remember, this script i

Re: [perf-discuss] how to find out kernel memory allocated by what threads?

2009-01-05 Thread johansen
> If you turn on kernel memory auditing with the kmem_flags variable, > you can use the ::kmausers dcmd in mdb to see which kernel stacks > resulted in the most memory allocations from each kmem cache. To > enable auditing, add 'set kmem_flags=0x1' to /etc/system and reboot. > > > We have a V490

Re: [perf-discuss] Q: Rehome thread to new lgroup?

2008-12-10 Thread johansen
> I tried, but for some threads, I got surprising results:
>
> > ./plgrp -a all 27215
>      PID/LWPID  HOME  AFFINITY
>      27215/12 5/strong,0-4,6-24/none
>
> If the home lgroup is defined as the lgroup with the strongest
> affinity, isn't the output above somewhat contradictory?
>

Re: [perf-discuss] Code review request for FileBench code to implement random file access within fiilesets

2008-12-03 Thread johansen
Hi Drew, These changes look fine to me. I'm impressed that you went to the trouble to write your own AVL tree implementation; however, I suppose it was required for portability. Solaris has one on the default install, in /usr/include/sys/avl.h and /usr/lib/libavl.so.1. -j On Mon, Dec 01, 2008

Re: [perf-discuss] fork failure - vmem stats

2008-11-14 Thread johansen
No, as far as I can tell, the kmem allocs in fork use KM_SLEEP. This means that they're guaranteed to return memory, and will sleep until some becomes available. EAGAIN covers a lot of different errors in fork. EAGAIN: A resource control or limit on the total number of

Re: [perf-discuss] Can anyone help with posix_fadvise on ZFS?

2008-11-13 Thread johansen
In general, [EMAIL PROTECTED] is the place to ask questions about ZFS issues. There's a manual page for posix_fadvise(3C), but it's up to the operating system to decide what to do with this kind of advice. If you look at the source for posix_fadvise(3C), you'll see that this doesn't result in any
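Because the operating system is free to ignore this kind of advice, a program should treat the call as a hint and work the same with or without it. A minimal sketch, assuming Python's os.posix_fadvise wrapper is available on the platform; read_with_advice is a hypothetical helper, and a successful return only means the call did not fail, not that any read-ahead took place.

```python
import os
import tempfile


def read_with_advice(path, chunk_size=65536):
    """Read a whole file after hinting sequential access, if possible.

    Returns (data, advised). `advised` reports only that posix_fadvise
    returned without error; the OS may still have ignored the hint.
    """
    advised = False
    chunks = []
    fd = os.open(path, os.O_RDONLY)
    try:
        if hasattr(os, "posix_fadvise"):
            try:
                # offset 0, length 0: the advice covers the entire file
                os.posix_fadvise(fd, 0, 0, os.POSIX_FADV_SEQUENTIAL)
                advised = True
            except OSError:
                pass  # advice rejected; reading still works
        while True:
            chunk = os.read(fd, chunk_size)
            if not chunk:
                break
            chunks.append(chunk)
    finally:
        os.close(fd)
    return b"".join(chunks), advised


with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"x" * 100000)
data, advised = read_with_advice(tmp.name)
os.unlink(tmp.name)
print(len(data), "bytes read; advice call succeeded:", advised)
```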

Re: [perf-discuss] fb_onnv_videoserv-hgchild

2008-10-22 Thread johansen
Hi Drew, Sorry it has taken me so long to get to this review. The comments below are for the latest webrev you sent me in e-mail, not the one on cr.opensolaris.org. In general this code looks fine. I just had a few nits, listed below. - fileset.c: - line 278: Since you're performing a strcpy

Re: [perf-discuss] Using Kstat framework

2008-09-25 Thread johansen
> ? Do we have option to reset the kernel statistics in our program? No, but this isn't necessary. > ? When I run the command "kstat sctp 10" on my Solaris system, it > provides me the statistics in every 10 seconds. So can I have the same > option when using the kstat framework in my implementat

Re: [perf-discuss] CMT and NUMA proposals

2008-08-15 Thread johansen
A plus one to both. I realize it may not be feasible now, but long term it would be great if we could move the sun-internal e-mail lists that exist around these projects into the open. If we get enough projects using perf-discuss as a default mailing list it's going to be confusing for everyone.

[perf-discuss] CMT and NUMA proposals

2008-08-15 Thread johansen
Hi Jonathan, I'm in favor of both of these proposals. However, I think they're incomplete, at least according to the project instantiation guidelines. Would you amend these proposals to include the participants for each project, information about each project's mailing list, and the consolidation

Re: [perf-discuss] find swapping prozess PID

2008-07-01 Thread johansen
> It looks strange, but for the operating system it isn't - those threads > (or LWPs, as the man page reads) are waiting for an event or signal, at > which point they will swap back in. IIRC, there was a bug about this a while ago. Essentially, the FX and RT scheduling classes didn't implement CL

Re: [perf-discuss] find swapping prozess PID

2008-07-01 Thread johansen
> Does anybody have an idea how I could identify the PID of the swapping processes? I'm not sure why there isn't a good way of doing this. Perhaps I've missed a more obvious approach. I would do this with mdb. As a privileged user, do the following: # mdb -k > ::walk proc pp | ::print proc_t p_swapc

Re: [perf-discuss] [tools-compilers] Application runs almost 2xslower on Nevada than Linux

2008-06-24 Thread johansen
> Could you try an experiment and compile you sources with > /usr/lib/libast.so.1 (you need to compile the sources with > -I/usr/include/ast before /usr/include/ since libast uses a different > symbol namespace and cannot be used to "intercept" other > |malloc()|/|free()| calls like libbsdmalloc) ?

Re: [perf-discuss] Performance issue

2008-06-11 Thread johansen
in a blog entry Neel wrote: http://blogs.sun.com/realneel/entry/zfs_and_databases This config seems pretty reasonable, but I'd double-check with the experts on zfs-discuss. -j On Wed, Jun 11, 2008 at 09:22:35AM -0700, Grant Lowe wrote: > Hi Johansen, > > As you requeste

Re: [perf-discuss] Performance issue

2008-06-11 Thread johansen
Some of this is going to echo Bob's thoughts and suggestions. > > Sun E4500 with Solaris 10, 08/07 release. SAN attached through a > > Brocade switch to EMC CX700. There is one LUN per file system. > > What do you mean by "one LUN per file system"? Do you mean that the > entire pool is mappe

Re: [perf-discuss] [storage-discuss] vdbench availability to the general community ?

2008-06-05 Thread johansen
What do these numbers mean? One reason I prefer the output from filebench is that it's self-descriptive. I'd be hard pressed to figure out what this means unless I was very familiar with the vdbench workload and its output. -j > *** VDBENCH TESTING > > array-1 > [...] > 10:06:59.0535

Re: [perf-discuss] Application runs almost 2x slower on Nevada than Linux

2008-05-01 Thread johansen
Yeah, I did some digging when I had a free moment. The following is the most germane to your issue. 5070823 poor malloc() performance for small byte sizes -j On Thu, May 01, 2008 at 05:36:26PM -0400, Matty wrote: > We are building our application as a 32-bit entity on both Linux and > S

Re: [perf-discuss] Application runs almost 2x slower on Nevada than Linux

2008-05-01 Thread johansen
Part of the problem is that these allocations are very small:
# dtrace -n 'pid$target::malloc:entry { @a["allocsz"] = quantize(arg0); }' -c /tmp/xml
  allocsz
           value  ------------- Distribution ------------- count
               1 |
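For readers unfamiliar with quantize(), it builds a power-of-two histogram: each row's value is the lower bound of a bucket, and a sample is counted in the row with the largest power of two that does not exceed it. The sketch below mimics that bucketing for positive sizes only (DTrace also handles zero and negative values). The function name and the sample sizes are made up for illustration and are not data from the thread.

```python
from collections import Counter


def quantize(values):
    """Bucket positive integers by the largest power of two <= value."""
    buckets = Counter()
    for value in values:
        if value <= 0:
            raise ValueError("this sketch only handles positive sizes")
        buckets[1 << (value.bit_length() - 1)] += 1
    return dict(sorted(buckets.items()))


# Hypothetical malloc() request sizes, in bytes.
sizes = [1, 2, 3, 4, 7, 8, 24]
for lower_bound, count in quantize(sizes).items():
    print(f"{lower_bound:>8} | {'@' * count} {count}")
```

A distribution that piles up in the lowest rows is what "these allocations are very small" looks like in this format.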

Re: [perf-discuss] Application runs almost 2x slower on Nevada than Linux

2008-04-28 Thread johansen
Hey Dude, I pulled down a copy of your test program and ran a few experiments.
$ time ./xml
10 iter in 22.715982 sec
real    0m22.721s
user    0m22.694s
sys     0m0.007s
This seems to indicate that all of our time is being spent in usermode, so whatever it is in Solaris that is slower than L

Re: [perf-discuss] IO problem on production server

2008-03-26 Thread johansen
> If they cannot be optimized you can get about a 12% performance bump > using UFS + directio > > http://blogs.sun.com/realneel/ Go re-read that blog entry: If you do not penalize ZFS with double checksums, you can note that we are within 6% of our best UFS number. So 6% gives

Re: [perf-discuss] What does "/0" mean in thread name/id column in prstat -mL output ?

2008-02-14 Thread johansen
> To capture the resources used by short lived processes, you need to > use accounting You can use accounting, but you don't need to. It's also possible to do this with DTrace. DTrace has the benefit that it can be enabled only when needed, and the scripts allow it to be easily customized. Bren

Re: [perf-discuss] Code Review for new FileBench feature requested

2008-01-29 Thread johansen
> Hmmm, I guess I could allow individual parameters to be accessed with > $., so a couple lines like: > > usage " set \$iosize.min=defaults to $iosize.min" > usage " set \$iosize.gamma= defaults to $iosize.gamma" > > which would print out as: > set $iosize.min= defaults to 1

Re: [perf-discuss] Code Review for new FileBench feature requested

2008-01-29 Thread johansen
Hi Drew, I took a look at your webrev and in general I think this looks good. I only have a couple of nits:
- Does it make sense to coalesce the gamma code and PRNG code into a single module?
- I've never done anything with Lex/Yacc, so I didn't review that code
- In the two .f files you added

Re: [perf-discuss] ZFS write time performance question

2007-11-29 Thread johansen
> Test Case: Run 'iostat', then write a 1GB file using 'mkfile 1g > testfile' and then run iostat again. It may also be interesting to look at zpool iostat, which should show the I/O to the pool, from ZFS's perspective. I'd also be curious to know how your zpool was configured. > ZFS Test Result

Re: [perf-discuss] running my EMR on solaris

2007-11-19 Thread johansen
Hi Greg: Unless you have a canned set of benchmarks that we can run on a couple of different machines, it's hard to make generalizations about the performance that you're going to see for your specific workload. That said, there's a risk-free way of trying the CoolThreads servers. Sun will send y

Re: [perf-discuss] Psyscall crashes controlled process.

2007-10-26 Thread johansen
This is off-topic for performance-discuss. You might consider posing this question to the Observability Community, they're the maintainers of libproc. This is their webpage: http://www.opensolaris.org/os/community/observability/ Mailing list info is here: http://mail.opensolaris.org/mailman/li

Re: [perf-discuss] transactional memory + power efficiency

2007-10-25 Thread johansen
Prof Mark D. Hill had a paper published in ISCA this year titled Performance Pathologies in Hardware Transactional Memory. It's available here: http://www.cs.wisc.edu/multifacet/papers/isca07_pathologies.pdf It's a worthwhile read. -j On Thu, Oct 25, 2007 at 04:41:48PM -0300, Rafael Vanoni wr

Re: [perf-discuss] prefetch data for page fault

2007-09-24 Thread johansen
Agreed. At page fault time, we still have no idea whether the access was random or sequential. I suspect that there would only be a benefit to prefetching if we could detect that we're going to access the next N cache-lines sequentially. Even that seems like it could be more trouble than it's worth

Re: [perf-discuss] prefetch data for page fault

2007-09-24 Thread johansen
>I don't mean prefetching the page, but using a prefetch instruction > >(e.g. __asm__ __volatile__( " prefetchnta %0" : : "m" (*addr) ) ;// >addr is the first parameter of the pagefault handler, which caused the page fault >trap) to prefetch the content at addr into a cache line. Let me reite

Re: [perf-discuss] prefetch data for page fault

2007-09-21 Thread johansen
>"The user program would have trouble >doing the prefetch, >because it is often difficult to predict >where page faults will happen >and because a prefetch before >the page fault will probably have no effect. >" > >The prefetch instruction will execute in kernel

Re: [perf-discuss] Poor swap performance

2007-08-22 Thread johansen-osdev
> While investigating this, we came up with a test scenario that > consistently reproduced this behavior. The behavior is that if you > have a system with 4gb of memory, and create a 1gb file in /tmp, and a > 1gb file in /var/tmp, and then you start 2 processes each with an rss > of about 1gb, your

Re: [perf-discuss] Poor swap performance

2007-08-21 Thread johansen-osdev
Peter: Would you describe your swap configuration? The output from df -hlk and swap -l would be helpful. Thanks, -j On Wed, Aug 15, 2007 at 08:17:05AM -0700, Peter C. Norton wrote: > We use solaris 10 at my company, but I noticed this behavior is the > same/worse on sxde, and I wanted to know i

Re: [perf-discuss] Memory leaks detection

2007-07-06 Thread johansen-osdev
Hi Alex: These leaks in tar aren't enough to consume 1GB of memory. When applications leak memory, that memory will be returned to the system once the application exits. What are you using to measure the amount of memory consumed by your system? Can you explain why you think it is leaking memor

Re: [perf-discuss] Project sponsorship request: Tesla

2007-06-05 Thread johansen-osdev
Eric, You're one of the leaders of the performance community, so don't forget to vote for yourself. I think this would be great for the performance community. I'm in favor of this project. (+1) -j On Tue, Jun 05, 2007 at 01:28:50PM -0700, Eric Saxe wrote: > > I'd like to ask the OpenSolaris p

Re: [perf-discuss] RFE: "bt" (=batch) scheduler class...

2007-06-04 Thread johansen-osdev
> My idea was to provide a simple, "preconfigured" tool to do this. Without writing any new code, it seems that we already have a tool for doing this. The manpage for priocntl(1) is pretty explicit: In addition to the system-wide limits on user priority (displayed with priocnt

Re: [perf-discuss] RFE: "bt" (=batch) scheduler class...

2007-05-30 Thread johansen-osdev
I'm not sure that I understand why we need to introduce new code for this kind of functionality. Why not use the FX class and assign your batch processes priority 0 and a longer time quantum? priocntl(1) explains the details about how one might accomplish this. -j On Wed, May 30, 2007 at 11:57:

Re: [perf-discuss] Weird load spikes

2007-05-21 Thread johansen-osdev
Another place to start might be with Brendan Gregg's DTrace tools: http://www.brendangregg.com/dtrace.html His prustat, hotuser, hotkernel, and shortlived.d scripts might be helpful in your situation. -j On Mon, May 21, 2007 at 01:50:27PM -0700, Eric Saxe wrote: > Jeffrey Collyer wrote:

Re: [perf-discuss] Re: A good indicator of I/O problem

2007-05-21 Thread johansen-osdev
> Be careful with %w... it's not that accurate. If you upgrade your > e20k to Solaris 10, you'll lose that as iowait is no longer > calculated (although the %w column is still there for output > compatibility reasons). %b (% busy) is what you should be looking at > instead. That's not entir

Re: [perf-discuss] Impressed with express edition but..

2007-04-20 Thread johansen-osdev
You might consider posting this question to install-discuss. Other people have also complained about the RAM requirement. I don't believe the installer group monitors the performance list. -j On Fri, Apr 20, 2007 at 06:25:16AM -0700, Joseph villa wrote: > I just got my cd for the express editio

Re: [perf-discuss] Request for guidance on shmat() call attaching at different address

2007-04-04 Thread johansen-osdev
Ganesh: > Next when the client processes comes up, they attach to all these shms > created by server process. When the client process calls "shmat" on > the first 3 memory segments, shmat returns the same address which it > returned to server process (these addresses are stored in 1st shm). > But

Re: [perf-discuss] Project endorsement request: Enhance Solaris for Intel Project

2007-03-22 Thread johansen-osdev
I second Bart's endorsement, if it matters. -j On Thu, Mar 22, 2007 at 12:59:14PM -0700, Bart Smaalders wrote: > Eric Saxe wrote: > >I'd like to request endorsement from the performance community for the > >"Enable/enhance Solaris support for Intel Project". > >http://www.opensolaris.org/os/proj

Re: [perf-discuss] Re: Re: prstat-m - cpu% does not add to 100% ?

2007-02-13 Thread johansen-osdev
> Thanks! As long as I can trust (more or less) the numbers for the > USR/SYS and avoid the dampening effect of the regular prstat - I am > happy. I'm not sure I understand your comments about regular prstat. Would you be kind enough to describe the dampening effect you've mentioned? I'd be inte

Re: [perf-discuss] Re: Re: prstat-m - cpu% does not add to 100% ?

2007-02-12 Thread johansen-osdev
  ID USERNAME USR SYS TRP TFL DFL LCK SLP LAT VCX ICX SCL SIG PROCESS/NLWP
3067 johansen 1.0 0.2 0.0 0.0 0.0  33  65 0.1  1K 123 10K   0 firefox-bin/3
1723 johansen 0.6 0.3 0.0 0.0 0.0 0.0  99 0.2  1K  10  5K 141 Xorg/1
3301 johansen 0.0 0.1 0.0 0.0 0.0 0.0 100 0.0  45   0 404   0 prstat/1
24

Re: [perf-discuss] Re: Re: prstat-m - cpu% does not add to 100% ?

2007-02-12 Thread johansen-osdev
> OK, I think I got it, please correct if I am wrong: > > Suppose I do [b]prstat -m 10[/b], suppose a given lwp is in state X at > the beginning and transitions to state Y 4 seconds after the beginning > of the monitoring sample and back to X 1 sec later. So, in this scenario the time that your
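The scenario in the question can be worked through with a toy model. The sketch below is not how the kernel implements microstate accounting; it simply assumes the per-state times are fully up to date at both ends of the sample and divides each state's time by the interval length. The function name state_percentages and the state labels X and Y are placeholders taken from the question, not from prstat.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def state_percentages(initial_state: str,
                      transitions: List[Tuple[float, str]],
                      interval: float) -> Dict[str, float]:
    """Return the share of `interval` spent in each state, in percent.

    `transitions` is a time-ordered list of (seconds_from_start, new_state).
    Time is charged to the state being left at each transition, and the
    remainder of the interval is charged to the final state.
    """
    totals: Dict[str, float] = defaultdict(float)
    state, last = initial_state, 0.0
    for when, new_state in transitions:
        totals[state] += when - last
        state, last = new_state, when
    totals[state] += interval - last
    return {s: 100.0 * t / interval for s, t in totals.items()}


if __name__ == "__main__":
    # 10-second sample: X until t=4, Y from t=4 to t=5, then X again.
    print(state_percentages("X", [(4.0, "Y"), (5.0, "X")], 10.0))
```

Under these assumptions the lwp is charged 9 of the 10 seconds to X and 1 second to Y, and the percentages add up to 100.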

Re: [perf-discuss] Re: prstat-m - cpu% does not add to 100% ?

2007-02-11 Thread johansen-osdev
> This makes sense, however, I am still a bit confused. You are stating > that in Solaris 9 microstate accounting gets only updated when the lwp > transitions from one state to another. No, microstate data only gets updated when the lwp transitions from one state to another. This hasn't change

Re: [perf-discuss] prstat-m - cpu% does not add to 100% ?

2007-02-10 Thread johansen-osdev
Eugene: I think I understand why your microstate values aren't adding up to 100%. The fact that you're running Solaris 9 has a lot to do with the problem. Microstate accounting only updates its timestamps when a lwp transitions from one state to another. So, in your case, your lwp has been idle

Re: [perf-discuss] prstat-m - cpu% does not add to 100% ?

2007-02-10 Thread johansen-osdev
What version of Solaris are you running? Any additional details you could provide about your configuration and software would be helpful. -j On Fri, Feb 09, 2007 at 01:05:39PM -0800, Eugene Margulis wrote: > I understand that prstat -m shows microstate wallclock utilization that > should add up

Re: [perf-discuss] Re: Re: kstat / intrstat

2007-01-05 Thread johansen-osdev
> So let's say my CPUs are 1200MHz, does it make each increment > 1/1,200,000 second? 1 MHz is 1,000,000 Hz. So if your CPU is 1200 MHz, that's 1,200,000,000 Hz. With this in mind, the increment would actually be 1/1,200,000,000 of a second. > And if I have 600,000 interrupt 6 increments in one
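The unit conversion above can be checked with a few lines of arithmetic. This is only a sketch: it assumes the counter advances once per CPU clock cycle at a constant 1200 MHz, and the helper name cycles_to_seconds is made up for the example.

```python
def cycles_to_seconds(cycles: int, clock_hz: int) -> float:
    """Convert a count of CPU cycles to seconds at a fixed clock rate."""
    return cycles / clock_hz


# 1 MHz is 1,000,000 Hz, so 1200 MHz is 1,200,000,000 Hz.
CLOCK_HZ = 1200 * 1_000_000

if __name__ == "__main__":
    # One increment is 1/1,200,000,000 of a second.
    print(cycles_to_seconds(1, CLOCK_HZ))
    # 600,000 increments work out to half a millisecond.
    print(cycles_to_seconds(600_000, CLOCK_HZ))
```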

Re: [perf-discuss] Re: kstat / intrstat

2007-01-05 Thread johansen-osdev
> What exactly does "count of CPU cycles" mean here? CPU time? in nano > seconds? number of interrupts? Wikipedia has a relatively concise explanation of CPU clock rate here: http://en.wikipedia.org/wiki/CPU_clock In intrstat's case, the CPU cycles are read from an on-chip register that advances

Re: [perf-discuss] kstat / intrstat

2007-01-04 Thread johansen-osdev
These numbers for level-XX are a count of CPU cycles spent in each interrupt level. -j On Thu, Jan 04, 2007 at 12:05:19PM -0800, Sean Liu wrote: > This question is for Solaris 9 - I understand only Solaris 10 has intrstat, > but there is also intrstat provider for kstat in solaris 9: > #kstat -n

Re: [perf-discuss] Read lock contention, solaris 9, 32 cores host.

2006-12-14 Thread johansen-osdev
Konstantin: > This is single static RW lock which protects array of pointers to > data structures. this array slowly growing to size depended > of specific installation. growing with pretty big size increase at once. > say, on W this RW is locked once in a hour. a lot of threads, > which consume s
