tmountain wrote:
> Very cool. I actually cleaned up the code a little bit more this
> morning trying to speed things up a bit. It's still not as fast as I'd
> like, but I'm not up to speed on Clojure optimization either, so I
> could be missing something.

There are two things that I noticed in your code:
- you use nth on a seq (linear access),
- you append elements to seqs.

It would be better to use vectors instead of seqs:
- nth on a vector is random access (effectively constant time),
- when you conj an element onto a vector, it is appended at the end.

Below is the "vectorized" version; on my box it runs twice as fast as your original code. (I also removed the in-loop building of the string because it was needless.)

(ns markov
  (:use clojure.contrib.str-utils))

(defn rand-elt
  "Return a random element of this vector or seq"
  [s]
  (nth s (rand-int (count s))))

(defn clean
  "clean given txt for symbols disruptive to markov chains"
  [txt]
  (let [new-txt (re-gsub #"[:;,^\"()]" "" txt)
        new-txt (re-gsub #"'(?!(d|t|ve|m|ll|s|de|re))" "" new-txt)]
    new-txt))

(defn chain-lengths
  "return a set of lengths for each element in the collection"
  [markov-chain]
  (let [markov-keys (map keys markov-chain)]
    (set (for [x markov-keys] (count x)))))

(defn max-chain-length
  "return the length of the longest chain"
  [markov-chain]
  (apply max (chain-lengths markov-chain)))

(defn chain
  "Take a list of words and build a markov chain out of them.
  The length is the size of the key in number of words."
  ([words] (chain words 3))
  ([words length]
    (let [words (concat (repeat length nil) words)
          ;; every suffix that still holds more than `length` elements
          suffixes (take-while #(seq (drop length %)) (iterate rest words))]
      (reduce (fn [markov-chain suffix]
                ;; key: the first `length` words; value: the word that follows
                (merge-with into markov-chain
                            {(vec (take length suffix)) [(nth suffix length)]}))
              {} suffixes))))

(defn split-sentence
  "Convert a string to a collection on common boundaries"
  [text]
  (filter seq (re-split #"[,.!?()\d]+\s*" text)))

(defn file-chain
  "Create a markov chain from the contents of a given file"
  ([file] (file-chain file 3))
  ([file length]
    (let [sentences (split-sentence (slurp file))]
      ;; build one chain per sentence and merge them all together
      (reduce #(merge-with into %1 (chain (re-split #"\s+" %2) length))
              {} sentences))))

(defn construct-sentence
  "Build a sentence from a markov chain structure.
  Given a Markov chain (any size key), Seed (to start the sentence) and
  Proc (a function for choosing the next word), returns a sentence
  composed until it reaches the end of a chain (an end of sentence)."
  ([markov-chain] (construct-sentence markov-chain nil rand-elt))
  ([markov-chain seed] (construct-sentence markov-chain seed rand-elt))
  ([markov-chain seed proc]
    (let [seed (or seed (rand-elt (keys markov-chain)))
          ;; slide the key window: drop the oldest word, add a chosen follower
          next-key #(concat (rest %) [(proc (markov-chain %))])
          logorrhea (map first (iterate next-key seed))
          ;; skip the leading nil padding, stop at the first nil after it
          sentence (take-while identity (drop-while nil? logorrhea))]
      (str-join " " sentence))))

hth,

Christophe

--
Professional: http://cgrand.net/ (fr)
On Clojure: http://clj-me.blogspot.com/ (en)
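
As a small illustration of the seq-versus-vector point made in this message, here is a REPL-style sketch. The names v, s and merged are made up for the example and do not appear in the posted code; it only shows how conj, nth and merge-with into behave on the two kinds of collection.

```clojure
;; Illustrative names only (v, s, merged are assumptions for this sketch).
(def v [:a :b :c])   ; a vector
(def s (seq v))      ; a seq over the same elements

(conj v :d)          ; => [:a :b :c :d]   conj appends to a vector
(conj s :d)          ; => (:d :a :b :c)   conj prepends to a seq

(nth v 2)            ; => :c   indexed lookup on the vector
(nth s 2)            ; => :c   same result, but reached by walking from the head

;; With vector values, merge-with into keeps followers in arrival order.
(def merged
  (merge-with into {[nil nil "a"] ["b"]} {[nil nil "a"] ["c"]}))
;; merged => {[nil nil "a"] ["b" "c"]}
```

This is the behaviour the chain function relies on: each key maps to a vector of followers, so merging two chains simply appends the new followers to the existing ones.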