On Sun, Sep 13, 2009 at 7:09 PM, Richard Newman <holyg...@gmail.com> wrote:
>
> Thought I'd share with the group. Clojure's sets are fast!
>
> http://www.holygoat.co.uk/blog/entry/2009-09-13-1

Nice writeup!

...and I hate to be picky, but you can probably compare
strings sizes using a faster mechanism than the examples you
gave.

While 'count' can give the size of a string, it first checks
it's type against a couple other possibilities before
calling the string's .length method.

        (defn t [] (dotimes [i 2e7] (= 10 (count "123"))))
        (time (t)) ; run this a few times to let HotSpot work its magic
        --> "Elapsed time: 1000.960992 msecs"

If we know it's a string, we can call the .length method
directly.  If you also type-hint the string, this can make
the code rather uglier, and since this is hardly ever the
bottleneck in any application, it may not be worth it.  But
when you're benchmarking tiny opertions repeated a huge
number of times, this sort of detail starts to matter:

        (defn t [] (dotimes [i 2e7] (= 10 (.length "123"))))
        (time (t))
        --> "Elapsed time: 100.096957 msecs"

There's no need to type hint the string there because it's
a literal and the Clojure compiler already knows its type.

It's time to see if == will help us, since we are after all
dealing with numbers here:

        (defn t [] (dotimes [i 2e7] (== 10 (.length "123"))))
        (time (t))
        "Elapsed time: 96.791805 msecs"

So that's a bit better, but it's also worth looking at
where our numbers may be getting boxed or unboxed.

        (use '[clojure.contrib.repl-utils :only (expression-info)])
        (expression-info '10)
        --> {:class java.lang.Integer, :primitive? false}

        (expression-info '(.length "123"))
        --> {:class int, :primitive? true}

So now we're comparing a boxed Integer 10 with an unboxed
int returned by 'length'.  We can do better:

        (defn t [] (dotimes [i 2e7] (== (int 10) (.length "123"))))
        (time (t)) ; Watch HotSpot remove orders of magnitude, until:
        --> "Elapsed time: 0.044282 msecs"

That's really really fast.  That elapsed time is too small:
we'd have to increase the test size to get any useful
conclusion besides "really fast".  In fact, I almost wonder
if HotSpot is somehow memoizing the expression all by
itself.

Anyway, Clojure's sets by themselves appear to be fast
enough for you, or I assume you wouldn't have described them
as "crazy fast".  However, *if* you discover later that
they're are not fast enough you may be able to do length
tests faster than you were thinking.  This may yet be
a route to explore for improved overall speed.

--Chouser

--~--~---------~--~----~------------~-------~--~----~
You received this message because you are subscribed to the Google
Groups "Clojure" group.
To post to this group, send email to clojure@googlegroups.com
Note that posts from new members are moderated - please be patient with your 
first post.
To unsubscribe from this group, send email to
clojure+unsubscr...@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/clojure?hl=en
-~----------~----~----~----~------~----~------~--~---

Reply via email to