Hey Dude, I pulled down a copy of your test program and ran a few experiments.

    $ time ./xml
    100000 iter in 22.715982 sec

    real    0m22.721s
    user    0m22.694s
    sys     0m0.007s

This indicates that nearly all of the time is being spent in user mode, so whatever makes Solaris slower than Linux here, it's not the operating system. Just to double check, I used the following DTrace script to look at the time this application spends in syscalls:

    # dtrace -n '
    syscall:::entry
    /execname == "xml"/
    {
        self->traced = timestamp;
        self->func = probefunc;
    }

    syscall:::return
    /self->traced/
    {
        @a[self->func] = sum(timestamp - self->traced);
        @b[self->func] = count();
        self->func = (char *)0;
        self->traced = 0;
    }'

(This returns the time xml spends in each syscall as well as the number of times each syscall was made.)

Time (nanoseconds):

    getpid            1978
    sysconfig         2229
    sigpending        3071
    sysi86            3529
    getrlimit         4234
    setcontext        5763
    fstat64           7042
    close            16606
    write            19898
    getcwd           21302
    ioctl            23668
    munmap           25757
    brk              28445
    open             40712
    resolvepath      51777
    xstat            57616
    mmap            159275
    memcntl         267070

Number of invocations:

    fstat64            1
    getcwd             1
    getpid             1
    getrlimit          1
    ioctl              1
    sigpending         1
    sysconfig          1
    sysi86             1
    write              1
    setcontext         2
    memcntl            6
    close              7
    munmap             8
    open               8
    resolvepath        9
    xstat              9
    brk               10
    mmap              32

This agrees with the output from time. You're spending a minuscule amount of system time performing memory operations: the syscalls above add up to well under a millisecond.

The next place to look would be libxml2. What library versions do you have installed on the different machines? If the library versions don't match, or you've compiled with a different set of options, or used a different compiler to build the library, you'll get different results for the benchmark. I ran my test on snv_84, which has version 2.6.23 installed.

Another approach would be to use DTrace's profile provider. I obtained usable results with the following:

    # dtrace -n 'profile-1001 /execname == "xml"/ { @a[ustack()] = count(); } END { trunc(@a, 20); }' -c /tmp/xml

This returns the 20 most frequent user stacks (and the number of times they occurred) when DTrace fires a profile probe at 1001 Hz.

xmlDictLookup shows up as a popular function, but that may be a red herring. I'll include the top 5 here, but you can run this command, or adjust the truncation to view any amount you'd like:

              libc.so.1`_free_unlocked+0x4c
              libc.so.1`free+0x2b
              libxml2.so.2`xmlFreeNodeList+0x19e
              libxml2.so.2`xmlFreeNodeList+0xb6
              libxml2.so.2`xmlFreeNodeList+0xb6
              libxml2.so.2`xmlFreeDoc+0xac
              xml`main+0x55
              xml`_start+0x80
               61

              libxml2.so.2`xmlDictLookup+0x60
              libxml2.so.2`xmlParseNCName+0xab
              libxml2.so.2`xmlParseQName+0x44
              libxml2.so.2`xmlParseAttribute2+0x54
              libxml2.so.2`xmlParseStartTag2+0x246
              libxml2.so.2`xmlParseElement+0x99
              libxml2.so.2`xmlParseContent+0x12c
              libxml2.so.2`xmlParseElement+0x134
              libxml2.so.2`xmlParseContent+0x12c
              libxml2.so.2`xmlParseElement+0x134
              libxml2.so.2`xmlParseDocument+0x33b
              libxml2.so.2`xmlDoRead+0x6b
              libxml2.so.2`xmlReadMemory+0x36
              xml`main+0x4c
              xml`_start+0x80
               66

              libxml2.so.2`xmlParseStartTag2+0xb7e
              libxml2.so.2`xmlParseElement+0x99
              libxml2.so.2`xmlParseContent+0x12c
              libxml2.so.2`xmlParseElement+0x134
              libxml2.so.2`xmlParseContent+0x12c
              libxml2.so.2`xmlParseElement+0x134
              libxml2.so.2`xmlParseDocument+0x33b
              libxml2.so.2`xmlDoRead+0x6b
              libxml2.so.2`xmlReadMemory+0x36
              xml`main+0x4c
              xml`_start+0x80
               66

              libc.so.1`_free_unlocked+0x53
              libc.so.1`free+0x2b
              libxml2.so.2`xmlFreeNodeList+0x19e
              libxml2.so.2`xmlFreeNodeList+0xb6
              libxml2.so.2`xmlFreeNodeList+0xb6
              libxml2.so.2`xmlFreeDoc+0xac
              xml`main+0x55
              xml`_start+0x80
               72

              libxml2.so.2`xmlDictLookup+0x60
              libxml2.so.2`xmlParseNCName+0xab
              libxml2.so.2`xmlParseQName+0x44
              libxml2.so.2`xmlParseStartTag2+0x14c
              libxml2.so.2`xmlParseElement+0x99
              libxml2.so.2`xmlParseContent+0x12c
              libxml2.so.2`xmlParseElement+0x134
              libxml2.so.2`xmlParseContent+0x12c
              libxml2.so.2`xmlParseElement+0x134
              libxml2.so.2`xmlParseDocument+0x33b
              libxml2.so.2`xmlDoRead+0x6b
              libxml2.so.2`xmlReadMemory+0x36
              xml`main+0x4c
              xml`_start+0x80
               80

Hope this helps. It may be that you've got a newer / more optimized version of libxml2 on your Linux box.
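
If you'd rather see those five stacks rolled up by the function at the top of each stack, here is a rough Python sketch. It only uses the leaf frames and counts quoted above; the names TOP_STACKS, leaf_function and samples_by_function are made up for illustration, and with only five stacks in play the totals are a hint, not a full profile.

```python
from collections import Counter

# (leaf frame, sample count) for the five stacks quoted above.
# Only the topmost frame of each stack is kept here.
TOP_STACKS = [
    ("libc.so.1`_free_unlocked+0x4c", 61),
    ("libxml2.so.2`xmlDictLookup+0x60", 66),
    ("libxml2.so.2`xmlParseStartTag2+0xb7e", 66),
    ("libc.so.1`_free_unlocked+0x53", 72),
    ("libxml2.so.2`xmlDictLookup+0x60", 80),
]


def leaf_function(frame):
    """Strip the module prefix and the +offset from a ustack() frame."""
    symbol = frame.split("`", 1)[-1]   # drop "libxml2.so.2`"
    return symbol.split("+", 1)[0]     # drop "+0x60"


def samples_by_function(stacks):
    """Sum the sample counts of all stacks that share a leaf function."""
    totals = Counter()
    for frame, count in stacks:
        totals[leaf_function(frame)] += count
    return totals


if __name__ == "__main__":
    for name, count in samples_by_function(TOP_STACKS).most_common():
        print(f"{name:20s} {count}")
```

Grouped this way, xmlDictLookup accounts for 146 of these samples, _free_unlocked for 133 and xmlParseStartTag2 for 66, so the time in free() is in the same range as the dictionary lookups, at least within this top 5.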

-j

On Mon, Apr 28, 2008 at 04:21:19PM -0400, Matty wrote:
> Howdy,
>
> I have been working with one of our developers to port a Linux
> application to opensolaris. While benchmarking the app, we noticed that
> it ran 2x slower on a Nevada build 85 host than it did on Linux. The
> application utilizes libxml to transform XML documents, and I think we
> have narrowed down the discrepancy to the xmlReadMemory libxml function.
> We created a test program to call xmlReadMemory 100k times, and measured
> the program execution time. Here are the results we got on the same
> hardware (Sun X2200):
>
> CentOS Linux 5:
>
> % ./xml
> 100000 iter in 9.581637 sec
>
> Nevada build 85:
>
> % ./xml
> 100000 iter in 15.983286 sec
>
> When we ran the collect / er_print tools to generate execution
> profiles, we noticed that the top functions make heavy use of memory.
> Based on a number of documents I read on the Sun developer site, I tried
> different compiler flags, memory allocators (mtmalloc, libumem) and
> compilers (Sun studio 12). These items didn't help, and we are currently
> a bit mystified. The test program our developer wrote is available here:
>
> http://prefetch.net/xml.c
>
> And the following command line can be used to compile it:
>
> $ gcc -O3 -o xml `/usr/bin/xml2-config --libs --cflags` xml.c
>
> That said, does anyone have any thoughts on why the same code would be
> slower on Nevada? I really want to get our app running on opensolaris,
> but it's hard to justify the port when it's not running as well as it
> does under Linux. :(
>
> Thanks for any insight,
> - Ryan
> _______________________________________________
> perf-discuss mailing list
> perf-discuss@opensolaris.org