I have written a Clojure implementation of java.lang.CharSequence
that is constructed from a length and an ISeq of strings.  I need this
because I want to pass the resulting CharSequence into Java's regex
library.  I got it working (thanks to the docs and some good examples
that I found in the discussion group) and I'm now trying to optimize
it.  I'm willing to accept the Clojure code being ~2x slower than the
Java equivalent, but the best I have managed so far is 10x slower.

The code is below.  After examining the resulting bytecode, it looks
to me like the problem is that the Clojure compiler dispatches every
method to an IFn that is bound as part of the class initialization.
It's a cool idea that probably saved implementation effort, but I'm
pretty sure all the boxing/unboxing of primitives is what's killing
the performance.
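
To make the suspected overhead concrete, here is a minimal sketch of
what I believe such a dispatch amounts to.  It is not the actual
gen-class output; char-at-impl and call-like-stub are made-up names
used only for illustration, and the real generated stub may differ in
detail.

```clojure
;; Hypothetical stand-in for the fn that a generated method stub calls.
;; Every argument of a Clojure fn is an Object, so the index arrives boxed.
(defn char-at-impl [#^CharSequence cs i]
  (.charAt cs (int i)))            ; unbox the index, read one char

;; Rough imitation of a charAt(int) stub: go through IFn.invoke
;; (Objects in, Object out), then unbox the Character that comes back.
(defn call-like-stub [#^CharSequence cs i]
  (let [#^clojure.lang.IFn f char-at-impl
        boxed-result (.invoke f cs i)]
    (.charValue #^Character boxed-result)))

(call-like-stub "hello" 1) ; => \e
```

Each call therefore pays for one boxed index and one boxed Character,
which adds up when the regex engine calls charAt once per character.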

My question is:  Can someone come up with a better way of implementing
this to avoid the performance problems?

;; char_sequence

(ns SeqCharSequence
  (:gen-class
   :implements [java.lang.CharSequence]
   :init init
   :state state
   :constructors {[Integer Object] []})
  (:use [clojure.contrib.test-is]))

(defstruct state-struct
        :buffer
        :strings
        :length)

(defn -init [length strings]
        [[]
        (struct state-struct (StringBuilder.) (atom strings) length)])

(defmacro ensure-capacity [state index]
        `(let [{#^StringBuilder buffer# :buffer} ~state]
                (while (>= ~index (. buffer# (length)))
                        (let [{strings# :strings} ~state
                               strings-seq# @strings#
                               #^String newString# (first strings-seq#)]
                                (. buffer# (append newString#))
                                (compare-and-set! strings# strings-seq#
                                                  (rest strings-seq#))))
                buffer#))

(defn -charAt [#^SeqCharSequence this i]
        (let [#^StringBuilder buffer (ensure-capacity (.state this) i)]
                (. buffer (charAt i))))

(defn -subSequence [#^SeqCharSequence this start end]
        (let [state (.state this)
              #^StringBuilder buffer (ensure-capacity state end)]
                (. buffer subSequence start end)))

(defn -length [#^SeqCharSequence this]
        (let [{length :length} (.state this)]
                length))

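(defn -toString [#^SeqCharSequence this]
        (let [length (. this (length))
              #^CharSequence charSequence (. this subSequence 0 length)]
                ;; String has no constructor taking a CharSequence, so
                ;; (String. charSequence) falls back to reflection.
                (.toString charSequence)))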

(deftest performance-test
        (let [words ["hello\n" "world\n"]
              #^java.util.regex.Pattern pattern #"\n"]
                (time (dotimes [_ 10000000]
                        (let [#^SeqCharSequence cs (SeqCharSequence. 10 words)]
                                (. pattern split cs))))))

(deftest test-length
        (let [cs (SeqCharSequence. 10 ["hello" "world"])]
                (is (= 10 (. cs (length))))))

(deftest test-toString
        (let [cs (SeqCharSequence. 10 ["hello" "world"])]
                (is (= "helloworld" (. cs (toString))))))

FYI: I made ensure-capacity a macro in an effort to avoid
boxing/unboxing before I realized how the dispatch to IFns worked.