On Mon, Apr 28, 2008 at 5:01 PM, <[EMAIL PROTECTED]> wrote: > Hey Dude, > I pulled down a copy of your test program and ran a few experiments. > > $ time ./xml > 100000 iter in 22.715982 sec > > real 0m22.721s > user 0m22.694s > sys 0m0.007s > > This seems to indicate that all of our time is being spent in usermode, so > whatever it is in Solaris that is slower than Linux, it's not the operating > system. > > Just to double check, I used the following dtrace to look at the time this > application is spending in syscalls: > > # dtrace -n 'syscall:::entry /execname == "xml"/ { self->traced = timestamp; > self->func = probefunc;} syscall:::return /self->traced/ { @a[self->func] = > sum(timestamp - self->traced); @b[self->func] = count(); self->func = (char > *)0; self->timestamp = 0;}' > > (This returns the time xml spends in syscalls as well as the number of > syscalls > made) > > Time (nanoseconds): > > getpid 1978 > sysconfig 2229 > sigpending 3071 > sysi86 3529 > getrlimit 4234 > setcontext 5763 > fstat64 7042 > close 16606 > write 19898 > getcwd 21302 > ioctl 23668 > munmap 25757 > brk 28445 > open 40712 > resolvepath 51777 > xstat 57616 > mmap 159275 > memcntl 267070 > > Number of invocations: > > fstat64 1 > getcwd 1 > getpid 1 > getrlimit 1 > ioctl 1 > sigpending 1 > sysconfig 1 > sysi86 1 > write 1 > setcontext 2 > memcntl 6 > close 7 > munmap 8 > open 8 > resolvepath 9 > xstat 9 > brk 10 > mmap 32 > > This agrees with the output from time. You're spending a miniscule amount > of system time performing memory operations. > > The next place to look would be at libxml2. What library versions do > you have installed on the different machines. If the library versions > don't match, or you've compiled with a different set of options, or used > a different compiler to build the library, you'll get different results > for the benchmark. > > I ran my test on snv_84. It has version 2.6.23 installed. > > Another approach would be to use DTrace's profile provider. I obtained > usable results by performing the following: > > # dtrace -n 'profile-1001 /execname == "xml"/ [EMAIL PROTECTED]()] = > count();} END { trunc(@a, 20);}' -c /tmp/xml > > This returns the 20 most frequent stack frames (and the number of times > the occurred) when DTrace runs a profile probe at 1001hz. > > xmlDictLookup shows up as a popular function, but that may be a red > herring. I'll include the top 5 here, but you can run this command, or > adjust > the truncation to view any amount you'd like: > > libc.so.1`_free_unlocked+0x4c > libc.so.1`free+0x2b > libxml2.so.2`xmlFreeNodeList+0x19e > libxml2.so.2`xmlFreeNodeList+0xb6 > libxml2.so.2`xmlFreeNodeList+0xb6 > libxml2.so.2`xmlFreeDoc+0xac > xml`main+0x55 > xml`_start+0x80 > 61 > > libxml2.so.2`xmlDictLookup+0x60 > libxml2.so.2`xmlParseNCName+0xab > libxml2.so.2`xmlParseQName+0x44 > libxml2.so.2`xmlParseAttribute2+0x54 > libxml2.so.2`xmlParseStartTag2+0x246 > libxml2.so.2`xmlParseElement+0x99 > libxml2.so.2`xmlParseContent+0x12c > libxml2.so.2`xmlParseElement+0x134 > libxml2.so.2`xmlParseContent+0x12c > libxml2.so.2`xmlParseElement+0x134 > libxml2.so.2`xmlParseDocument+0x33b > libxml2.so.2`xmlDoRead+0x6b > libxml2.so.2`xmlReadMemory+0x36 > xml`main+0x4c > xml`_start+0x80 > 66 > > libxml2.so.2`xmlParseStartTag2+0xb7e > libxml2.so.2`xmlParseElement+0x99 > libxml2.so.2`xmlParseContent+0x12c > libxml2.so.2`xmlParseElement+0x134 > libxml2.so.2`xmlParseContent+0x12c > libxml2.so.2`xmlParseElement+0x134 > libxml2.so.2`xmlParseDocument+0x33b > libxml2.so.2`xmlDoRead+0x6b > libxml2.so.2`xmlReadMemory+0x36 > xml`main+0x4c > xml`_start+0x80 > 66 > > libc.so.1`_free_unlocked+0x53 > libc.so.1`free+0x2b > libxml2.so.2`xmlFreeNodeList+0x19e > libxml2.so.2`xmlFreeNodeList+0xb6 > libxml2.so.2`xmlFreeNodeList+0xb6 > libxml2.so.2`xmlFreeDoc+0xac > xml`main+0x55 > xml`_start+0x80 > 72 > > libxml2.so.2`xmlDictLookup+0x60 > libxml2.so.2`xmlParseNCName+0xab > libxml2.so.2`xmlParseQName+0x44 > libxml2.so.2`xmlParseStartTag2+0x14c > libxml2.so.2`xmlParseElement+0x99 > libxml2.so.2`xmlParseContent+0x12c > libxml2.so.2`xmlParseElement+0x134 > libxml2.so.2`xmlParseContent+0x12c > libxml2.so.2`xmlParseElement+0x134 > libxml2.so.2`xmlParseDocument+0x33b > libxml2.so.2`xmlDoRead+0x6b > libxml2.so.2`xmlReadMemory+0x36 > xml`main+0x4c > xml`_start+0x80 > 80 > > Hope this helps. It may be that you've got a newer / more optimized > version of libxml2 on your Linux box.
Thanks for the reply. When I mentioned memory operations, I probably should have been more specific. The libxml test case I sent makes heavy use of malloc, and according to the Sun Studio collect utility, the calls to malloc are taking up a significant percentage of the overall execution time. Here is a snippet from the collect output: Excl. Incl. Name User CPU User CPU sec. sec. 26.789 26.789 <Total> 2.292 10.497 umem_cache_alloc 1.391 3.062 xmlDictLookup 1.381 1.381 mutex_unlock 1.301 5.084 umem_slab_alloc 1.241 2.742 umem_slab_create 1.091 15.671 xmlParseStartTag2 0.931 0.931 mutex_lock_impl 0.801 1.771 umem_depot_alloc 0.801 3.052 xmlParseNCName 0.771 1.701 mutex_lock 0.751 9.407 xmlSAX2StartElementNs 0.680 11.768 malloc 0.680 25.488 xmlParseElement 0.640 8.506 xmlSAX2TextNode 0.630 6.935 xmlParseCharData 0.590 11.088 umem_alloc 0.580 0.580 memcpy 0.490 0.490 xmlDictComputeKey 0.480 0.480 mutex_trylock 0.420 6.344 xmlSAX2Characters 0.420 0.420 xmlSkipBlankChars 0.380 24.517 xmlParseContent 0.380 0.380 xmlStrEqual 0.360 1.771 xmlNewPropInternal 0.330 0.330 xmlNextChar 0.320 0.330 xmlParseAttValueInternal Has anyone seen this behavior before? Has anyone found any compiler flags or light weight memory allocators that might help? Thanks, - Ryan _______________________________________________ perf-discuss mailing list perf-discuss@opensolaris.org