Re: Dubious performance characteristics of hash-maps

Daniel Ribeiro Tue, 23 Feb 2010 10:56:53 -0800

IBM has this very comprehensive guide on java benchmarking:
http://www.ibm.com/developerworks/java/library/j-benchmark1.html


It is quite long, but there are so many parameters and optimizations
on the JVM that overlooking them is not a real option for a real
benchmark.

Regards,
Daniel

On Feb 23, 11:29 am, Mark Tomko <mjt0...@gmail.com> wrote:
> I'd suggest allocating more heap than the default JVM settings, or
> running the JVM in server mode.  I'm not sure what Haskell's
> underlying implementation is, but I'd assume that it can grow its heap
> (somewhat) arbitrarily, which the JVM cannot, beyond a fixed bound.
>
> On Feb 23, 12:51 am, "Edward Z. Yang" <ezy...@mit.edu> wrote:
>
> > I'd first like to state that I went into this exercise expecting
> > Bagwell's Hash Array Mapped Tries to blow the competition out of the
> > water; I'd get a fast functional map and give it to Haskell and people
> > would rejoice.
>
> > Instead, I found a functional hash-map that was approximately four times
> > slower than Haskell's IntMap.  I find this puzzling, and would love to
> > puzzle it out with you folks.
>
> > First, the test case:
>
> >     (ns maptest (:gen-class))
>
> >     (defn mk-random-stream []
> >         (let [r (new ec.util.MersenneTwisterFast)]
> >                  (repeatedly (fn [] (. r (nextInt))))))
>
> >     (defn -main [sn]
> >       (let [n  (Integer/parseInt sn)
> >                   rs (take (* 2 n) (mk-random-stream))]
> >        (time (let [m  (apply hash-map (take (* n 2) (interleave rs rs)))]
> >             (reduce
> >                     (fn [s k] (let [v (get m k)]
> >                                             (+ s (if (nil? v) 0 v))))
> >                     0
> >                     (take n (drop (/ n 2) rs)))))))
>
> > We generate an n-sized hash tree, and then do n accesses, half of which
> > match and half of which miss.  We compute the sum of all the values that
> > catch.  We see the following characteristics:
>
> >        n   Clojure   Haskell (HAMT)   Haskell (IntMap)
> >      32K     .56s            .30s           .09s
> >      64K     .84s            .69s           .22s
> >     128K    1.65s           1.50s           .46s
> >     256K    2.94s           3.23s          1.17s
> >     512K    7.32s           6.96s          2.70s
>
> > I was careful to compile the Clojure code before attempting to run it.
> > I also pre-allocate the entire random sequence to prevent it from
> > muddying the execution time (although I am using Sean Luke's excellent
> > fast Mersenne Twister [1]).  The full string I used to invoke the
> > program was:
>
> >     /usr/lib/jvm/java-6-sun-1.6.0.16/bin/java -Dfile.encoding=UTF-8 \
> >         -classpath $CLASSES maptest $@
>
> > The fact that it's similar in performance to the Haskell
> > reimplementation (I'd say that the variance is not particularly
> > significant and both are about on par performance wise) seems to
> > validate the testing methodology, I hope, though the real proof of the
> > pudding would lie in implementing Haskell-style IntMap in Java.
>
> > You can see the test code for the Haskell versions here:
>
> >    http://github.com/ezyang/hamt
>
> > So... what's up with these numbers?  Is my methodology wrong?  Is the
> > garbage collector sucking?  Is the algorithm just not as good as it
> > makes out to be?
>
> > Cheers,
> > Edward
>
> > [1]http://www.cs.gmu.edu/~sean/research/

-- 
You received this message because you are subscribed to the Google
Groups "Clojure" group.
To post to this group, send email to clojure@googlegroups.com
Note that posts from new members are moderated - please be patient with your 
first post.
To unsubscribe from this group, send email to
clojure+unsubscr...@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/clojure?hl=en

Re: Dubious performance characteristics of hash-maps

Reply via email to