Part of the problem is that these allocations are very small:

# dtrace -n 'pid$target::malloc:entry { @a["allocsz"] = quantize(arg0); }' -c /tmp/xml

  allocsz                                           
           value  ------------- Distribution ------------- count    
               1 |                                         0        
               2 |                                         300000   
               4 |@@@@@                                    4700005  
               8 |@@                                       1600006  
              16 |@@@@@                                    4300015  
              32 |@@@@@@@@@@@@@@@@@@@@@@@@@@@              24000006 
              64 |                                         200001   
             128 |                                         400001   
             256 |                                         100000   
             512 |                                         0        
            1024 |                                         100000   
            2048 |                                         100000   
            4096 |                                         0        
            8192 |                                         100000   
           16384 |                                         0        
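
To read the histogram above: quantize() files each value under the
largest power of two that does not exceed it, so the row labeled 32
covers requests of 32 to 63 bytes. A small Python sketch (the helper
name bucket() is only for illustration, and the counts are copied by
hand from the output) shows how much of the total falls below 64 bytes:

```python
# Non-empty rows copied by hand from the quantize() output above:
# bucket lower bound (bytes) -> number of malloc calls.
histogram = {
    2: 300000, 4: 4700005, 8: 1600006, 16: 4300015, 32: 24000006,
    64: 200001, 128: 400001, 256: 100000, 1024: 100000,
    2048: 100000, 8192: 100000,
}


def bucket(size):
    """Return the largest power of two <= size (size must be positive)."""
    return 1 << (size.bit_length() - 1)


total = sum(histogram.values())
# Every row whose lower bound is under 64 holds requests of 63 bytes or less.
small = sum(n for low, n in histogram.items() if low < 64)

print("a 60-byte request lands in row", bucket(60))
print("calls below 64 bytes: %d of %d (%.1f%%)"
      % (small, total, 100.0 * small / total))
```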

After seeing this, I took a look at the exact breakdown of the
allocation sizes:

# dtrace -n 'pid$target::malloc:entry { @a[arg0] = count(); }' -c /tmp/xml

               12                1
               96                1
              200                1
               21           100000
               43           100000
               44           100000
               51           100000
               61           100000
               75           100000
               88           100000
              128           100000
              147           100000
              181           100000
              220           100000
              440           100000
             1024           100000
             2048           100000
             8194           100000
                8           100001
               52           100001
                6           100002
               36           100004
               24           100005
               33           200000
                4           200001
               17           200001
                9           200003
                3           300000
               10           300000
               13           300000
               14           300000
               25           300000
               28           400000
               11           400001
               20           700009
               40           900000
                5           900001
               16          2500000
                7          3500001
               48          3800001
               60         18500000

The most frequent malloc call is to allocate 60 bytes, which accounts
for about half of all calls.  I believe that we have a known issue with
small mallocs on Solaris.  There's a bug open for this somewhere;
however, I can't find its number at the moment.
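
As a rough check on the listing above, here is a short Python sketch
that ranks the request sizes by call count and reports each one's share
of the total. The pairs are copied by hand from the DTrace output;
nothing here is measured independently.

```python
# (size in bytes, call count) pairs copied by hand from the listing above.
counts = [
    (12, 1), (96, 1), (200, 1),
    (21, 100000), (43, 100000), (44, 100000), (51, 100000), (61, 100000),
    (75, 100000), (88, 100000), (128, 100000), (147, 100000),
    (181, 100000), (220, 100000), (440, 100000), (1024, 100000),
    (2048, 100000), (8194, 100000),
    (8, 100001), (52, 100001), (6, 100002), (36, 100004), (24, 100005),
    (33, 200000), (4, 200001), (17, 200001), (9, 200003),
    (3, 300000), (10, 300000), (13, 300000), (14, 300000), (25, 300000),
    (28, 400000), (11, 400001), (20, 700009), (40, 900000), (5, 900001),
    (16, 2500000), (7, 3500001), (48, 3800001), (60, 18500000),
]

total = sum(n for _, n in counts)

# Five most frequent request sizes, largest call count first.
top = sorted(counts, key=lambda pair: pair[1], reverse=True)[:5]

for size, n in top:
    print("%5d bytes: %9d calls (%4.1f%% of all calls)"
          % (size, n, 100.0 * n / total))
```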

Another problem that you may have run into is the 32-bit versus 64-bit
compilation problem.  I was able to shave about 9 seconds (roughly 40%)
off my runtime by compiling your testcase as a 64-bit app instead of a
32-bit one:

$ gcc -O3 -o xml `/usr/bin/xml2-config --libs --cflags` xml.c
$ file xml
xml:            ELF 32-bit LSB executable 80386 Version 1 [FPU], dynamically linked, not stripped, no debugging information available
$ ./xml
100000 iter in 22.749836 sec

versus:

$ gcc -m64 -O3 -o xml `/usr/bin/xml2-config --libs --cflags` xml.c
$ file xml
xml:            ELF 64-bit LSB executable AMD64 Version 1, dynamically linked, not stripped, no debugging information available
$ ./xml
100000 iter in 13.785916 sec
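
For reference, the arithmetic behind that comparison, as a small Python
sketch. The two timings and the iteration count are taken from the runs
above; the variable names are just for illustration.

```python
ITERATIONS = 100000

# Wall-clock times in seconds, copied from the two runs above.
runs = {"32-bit": 22.749836, "64-bit": 13.785916}

saved = runs["32-bit"] - runs["64-bit"]      # seconds saved by building with -m64
speedup = runs["32-bit"] / runs["64-bit"]    # how many times faster the 64-bit build ran

for name, seconds in runs.items():
    per_iter_us = seconds / ITERATIONS * 1e6
    print("%s: %.6f s total, %.1f us per iteration" % (name, seconds, per_iter_us))

print("saved %.2f s (%.0f%% of the 32-bit runtime), speedup %.2fx"
      % (saved, 100.0 * saved / runs["32-bit"], speedup))
```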


-j


On Wed, Apr 30, 2008 at 06:44:31PM -0400, Matty wrote:
> On Wed, Apr 30, 2008 at 6:26 PM, David Lutz <[EMAIL PROTECTED]> wrote:
> > If your application is single threaded, you could try using the
> >  bsdmalloc library.  This is a fast malloc, but it is not multi-thread
> >  safe and will also tend to use more memory than the default
> >  malloc.  For  a comparison of different malloc libraries, look
> >  at the NOTES section at the end of umem_alloc(3MALLOC).
> >
> >  I got the following result with your example code:
> >
> >
> >  $ gcc -O3 -o xml `/usr/bin/xml2-config --libs --cflags` xml.c
> >  $ ./xml
> >  100000 iter in 21.445672 sec
> >  $
> >  $ gcc -O3 -o xml `/usr/bin/xml2-config --libs --cflags` xml.c -lbsdmalloc
> >  $ ./xml
> >  100000 iter in 12.761969 sec
> >  $
> >
> >  I got similar results using Sun Studio 12.
> >
> >  Again, bsdmalloc is not multi-thread safe, so use it with caution.
> 
> Thanks David. Does anyone happen to know why the memory allocation
> libraries in Solaris are so much slower than their Linux counterparts? If
> the various malloc implementations were a second or two slower, I could
> understand. But they appear to be 10 - 12 seconds slower in our specific
> test case, which seems kinda odd.
> 
> Thanks,
> - Ryan
> _______________________________________________
> perf-discuss mailing list
> perf-discuss@opensolaris.org
