Hey Dude,
I pulled down a copy of your test program and ran a few experiments.

$ time ./xml
100000 iter in 22.715982 sec

real    0m22.721s
user    0m22.694s
sys     0m0.007s

This indicates that nearly all of our time is being spent in user mode, so
whatever it is that is slower on Solaris than on Linux, it's not the
operating system kernel.

Just to double check, I used the following dtrace to look at the time this 
application is spending in syscalls:

# dtrace -n 'syscall:::entry /execname == "xml"/ { self->traced = timestamp;
self->func = probefunc;}  syscall:::return /self->traced/ { @a[self->func] =
sum(timestamp - self->traced); @b[self->func] = count(); self->func = (char
*)0; self->traced = 0;}'

(This returns the time xml spends in syscalls as well as the number of syscalls
made)

Time (nanoseconds):

  getpid                                                         1978
  sysconfig                                                      2229
  sigpending                                                     3071
  sysi86                                                         3529
  getrlimit                                                      4234
  setcontext                                                     5763
  fstat64                                                        7042
  close                                                         16606
  write                                                         19898
  getcwd                                                        21302
  ioctl                                                         23668
  munmap                                                        25757
  brk                                                           28445
  open                                                          40712
  resolvepath                                                   51777
  xstat                                                         57616
  mmap                                                         159275
  memcntl                                                      267070

Number of invocations:

  fstat64                                                           1
  getcwd                                                            1
  getpid                                                            1
  getrlimit                                                         1
  ioctl                                                             1
  sigpending                                                        1
  sysconfig                                                         1
  sysi86                                                            1
  write                                                             1
  setcontext                                                        2
  memcntl                                                           6
  close                                                             7
  munmap                                                            8
  open                                                              8
  resolvepath                                                       9
  xstat                                                             9
  brk                                                              10
  mmap                                                             32

This agrees with the output from time.  You're spending a minuscule amount
of system time performing memory operations.
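
To put a number on "minuscule", here is a rough Python sketch that totals the
two tables above and compares the result with the wall-clock time reported by
time.  The figures are copied from the output above; treating the DTrace run
and the timed run as comparable is my assumption, since they were separate
invocations.

```python
# Per-syscall time in nanoseconds, copied from the DTrace output above.
syscall_ns = {
    "getpid": 1978, "sysconfig": 2229, "sigpending": 3071, "sysi86": 3529,
    "getrlimit": 4234, "setcontext": 5763, "fstat64": 7042, "close": 16606,
    "write": 19898, "getcwd": 21302, "ioctl": 23668, "munmap": 25757,
    "brk": 28445, "open": 40712, "resolvepath": 51777, "xstat": 57616,
    "mmap": 159275, "memcntl": 267070,
}

# Per-syscall invocation counts, copied from the DTrace output above.
syscall_count = {
    "fstat64": 1, "getcwd": 1, "getpid": 1, "getrlimit": 1, "ioctl": 1,
    "sigpending": 1, "sysconfig": 1, "sysi86": 1, "write": 1,
    "setcontext": 2, "memcntl": 6, "close": 7, "munmap": 8, "open": 8,
    "resolvepath": 9, "xstat": 9, "brk": 10, "mmap": 32,
}

# "real" time from the time(1) output above.
# Assumption: the DTrace-observed run took about as long as this one.
real_seconds = 22.721

total_ns = sum(syscall_ns.values())
total_calls = sum(syscall_count.values())

# Syscalls that manage the address space.
memory_ns = sum(syscall_ns[name] for name in ("mmap", "memcntl", "brk", "munmap"))

share_of_runtime = total_ns / (real_seconds * 1e9)

print(f"{total_calls} syscalls, {total_ns} ns in total")
print(f"memory-related syscalls: {memory_ns} ns ({memory_ns / total_ns:.0%} of syscall time)")
print(f"syscall share of wall-clock time: {share_of_runtime:.4%}")
```

That works out to well under a millisecond of syscall time across 100 calls,
against a run of more than 22 seconds.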

The next place to look would be at libxml2.  What library versions do
you have installed on the different machines?  If the library versions
don't match, or you've compiled with a different set of options, or used
a different compiler to build the library, you'll get different results
for the benchmark.

I ran my test on snv_84.  It has version 2.6.23 installed.

Another approach would be to use DTrace's profile provider.  I obtained
usable results by performing the following:

# dtrace -n 'profile-1001 /execname == "xml"/ { @a[ustack()] = count();}
END { trunc(@a, 20);}' -c /tmp/xml

This returns the 20 most frequent stack traces (and the number of times
they occurred) when DTrace runs a profile probe at 1001 Hz.

xmlDictLookup shows up as a popular function, but that may be a red
herring.  I'll include the top 5 here, but you can run this command, or adjust
the truncation to view as many as you'd like:

              libc.so.1`_free_unlocked+0x4c
              libc.so.1`free+0x2b
              libxml2.so.2`xmlFreeNodeList+0x19e
              libxml2.so.2`xmlFreeNodeList+0xb6
              libxml2.so.2`xmlFreeNodeList+0xb6
              libxml2.so.2`xmlFreeDoc+0xac
              xml`main+0x55
              xml`_start+0x80
               61

              libxml2.so.2`xmlDictLookup+0x60
              libxml2.so.2`xmlParseNCName+0xab
              libxml2.so.2`xmlParseQName+0x44
              libxml2.so.2`xmlParseAttribute2+0x54
              libxml2.so.2`xmlParseStartTag2+0x246
              libxml2.so.2`xmlParseElement+0x99
              libxml2.so.2`xmlParseContent+0x12c
              libxml2.so.2`xmlParseElement+0x134
              libxml2.so.2`xmlParseContent+0x12c
              libxml2.so.2`xmlParseElement+0x134
              libxml2.so.2`xmlParseDocument+0x33b
              libxml2.so.2`xmlDoRead+0x6b
              libxml2.so.2`xmlReadMemory+0x36
              xml`main+0x4c
              xml`_start+0x80
               66

              libxml2.so.2`xmlParseStartTag2+0xb7e
              libxml2.so.2`xmlParseElement+0x99
              libxml2.so.2`xmlParseContent+0x12c
              libxml2.so.2`xmlParseElement+0x134
              libxml2.so.2`xmlParseContent+0x12c
              libxml2.so.2`xmlParseElement+0x134
              libxml2.so.2`xmlParseDocument+0x33b
              libxml2.so.2`xmlDoRead+0x6b
              libxml2.so.2`xmlReadMemory+0x36
              xml`main+0x4c
              xml`_start+0x80
               66

              libc.so.1`_free_unlocked+0x53
              libc.so.1`free+0x2b
              libxml2.so.2`xmlFreeNodeList+0x19e
              libxml2.so.2`xmlFreeNodeList+0xb6
              libxml2.so.2`xmlFreeNodeList+0xb6
              libxml2.so.2`xmlFreeDoc+0xac
              xml`main+0x55
              xml`_start+0x80
               72

              libxml2.so.2`xmlDictLookup+0x60
              libxml2.so.2`xmlParseNCName+0xab
              libxml2.so.2`xmlParseQName+0x44
              libxml2.so.2`xmlParseStartTag2+0x14c
              libxml2.so.2`xmlParseElement+0x99
              libxml2.so.2`xmlParseContent+0x12c
              libxml2.so.2`xmlParseElement+0x134
              libxml2.so.2`xmlParseContent+0x12c
              libxml2.so.2`xmlParseElement+0x134
              libxml2.so.2`xmlParseDocument+0x33b
              libxml2.so.2`xmlDoRead+0x6b
              libxml2.so.2`xmlReadMemory+0x36
              xml`main+0x4c
              xml`_start+0x80
               80
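
As a rough sanity check on how much weight these five stacks carry, the Python
sketch below compares their sample counts with the number of samples a
1001 Hz profile would be expected to collect.  The 22.7 second on-CPU figure
is an assumption borrowed from the earlier timed run; the profiled run itself
was not timed.

```python
# Sample counts for the five stacks listed above, in the order shown.
top5_samples = [61, 66, 66, 72, 80]

# Profile rate used in the dtrace command above.
profile_hz = 1001

# Assumption: the profiled run was on-CPU for about as long as the timed
# run (user time of roughly 22.7 seconds).  It was not measured directly.
assumed_cpu_seconds = 22.7

expected_samples = profile_hz * assumed_cpu_seconds
top5_total = sum(top5_samples)
top5_share = top5_total / expected_samples

print(f"expected samples: about {expected_samples:.0f}")
print(f"samples in the top 5 stacks: {top5_total} ({top5_share:.1%})")
```

Under that assumption the five stacks together cover only a small fraction of
the samples, which is one more reason to treat any single stack with caution.
Aggregating on the full stack spreads one hot function across many distinct
call paths.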

Hope this helps.  It may be that you've got a newer / more optimized
version of libxml2 on your Linux box.

-j


On Mon, Apr 28, 2008 at 04:21:19PM -0400, Matty wrote:
> Howdy,
> 
> I have been working with one of our developers to port a Linux application to
> opensolaris. While benchmarking the app, we noticed that it ran 2x slower on
> a Nevada build 85 host than it did on Linux. The application utilizes libxml
> to transform XML documents, and I think we have narrowed down the discrepancy
> to the xmlReadMemory libxml function. We created a test program to call
> xmlReadMemory 100k times, and measured the program execution time. Here
> are the results we got on the same hardware (Sun X2200):
> 
> CentOS Linux 5:
> 
> %  ./xml
> 100000 iter in 9.581637 sec
> 
> Nevada build 85:
> 
> % ./xml
> 100000 iter in 15.983286 sec
> 
> When we ran the collect / er_print tools to generate execution profiles,
> we noticed that the top functions make heavy use of memory. Based on a
> number of documents I read on the Sun developer site, I tried different
> compiler flags, memory allocators (mtmalloc, libumem) and compilers
> (Sun studio 12). These items didn't help, and we are currently a bit
> mystified. The test program our developer wrote is available here:
> 
> http://prefetch.net/xml.c
> 
> And the following command line can be used to compile it:
> 
> $  gcc -O3 -o xml `/usr/bin/xml2-config --libs --cflags` xml.c
> 
> That said, does anyone have any thoughts on why the same code would be slower
> on Nevada? I really want to get our app running on opensolaris, but it's
> hard to justify the port when it's not running as well as it does under
> Linux. :(
> 
> Thanks for any insight,
> - Ryan
> _______________________________________________
> perf-discuss mailing list
> perf-discuss@opensolaris.org