On May 9, 2011, at 6:45 AM, Andreas Kostler wrote: > Hi all, > I'm trying to calculate the moving average of a certain window size. > One straight forward approach is: > (defn lazy-avg [coll] > (let [[sum cnt] (reduce > (fn [[v c] val] [(+ val v) (inc c)]) > [0 0] > coll)] > (if (zero? cnt) 0 (/ sum cnt)))) > > (let [window-size 5 > n 1000000] > (map lazy-avg (partition window-size 1 (range 0 n))) > > This takes about 2 seconds for 10^6 elements on my box. How can I > improve the runtime? > A slightly more performant (not much) approach keeping a rolling sum > would be: > > (defn partialsums [start lst] > (lazy-seq > (if-let [lst (seq lst)] > (cons start (partialsums (+ start (first lst)) (rest lst))) > (list start)))) > > (defn sliding-window-moving-average [window lst] > (map #(/ % window) > (let [start (apply + (take window lst)) > diffseq (map - (drop window lst) lst)] > (partialsums start diffseq)))) > > However, this causes the jvm to run out of heap space for n > 10^6 > > Is ~ 2 seconds the best I can do?
I took your first example and put it into a Clojure source file all its own, using AOT compilation and then timing the results of running the program from the command line from start to finish. I was using a 3-year old MacBook Pro with 2.4 GHz Intel Core 2 Duo, Mac OS X 10.6.7, Java provided by Apple (1.6.0_24-b07-334-10M3326, java -server is a 64-bit JVM). Note that I've previously found significantly different run time and memory uses from different JVMs running on different OSs for the same (hardware, Clojure source file, Clojure version) combination. All of my measurements were for a window size of 5 and n=10^6. For your first set of code above, the lowest elapsed time I got on 3 different runs was 10.338 sec. Total memory GCed was about 1.8 GBytes (I was using -Xmx256m on the java command line). For your second set of code above, the lowest elapsed time among 3 runs was 6.423 sec, with about 570 Mbytes of memory GCed during the run (also using -Xmx256m). With the source code below, I got as low as 3.329 sec of elapsed time with about 340 MBytes of memory GCed during the run (again -Xmx256m). Unlike your programs, this one is not lazy on the input sequence -- it always processes the whole thing. I don't hold it up as a model of Clojure code for readability or maintainability, but for squeezing run time out of the inner loop. Andy Fingerhut (ns movavg (:gen-class)) (set! *warn-on-reflection* true) (defn sliding-window-moving-average [window lst] (let [w (int window) w-1 (int (dec w))] (loop [rolling-sum (apply + (take w lst)) last-w (object-array (take w lst)) i (int 0) avgs (transient [(/ rolling-sum w)]) lst (drop w lst)] (if-let [lst (seq lst)] (let [old-val (aget last-w i) new-val (first lst) next-rolling-sum (- (+ rolling-sum new-val) old-val)] (aset last-w i new-val) (recur next-rolling-sum last-w (if (== i w-1) (int 0) (inc i)) (conj! avgs (/ next-rolling-sum w)) (rest lst))) ;; else (persistent! avgs))))) (defn -main [& args] (let [window-size (Integer/parseInt (nth args 0)) n (Integer/parseInt (nth args 1))] (println (sliding-window-moving-average window-size (range 0 n))))) -- You received this message because you are subscribed to the Google Groups "Clojure" group. To post to this group, send email to clojure@googlegroups.com Note that posts from new members are moderated - please be patient with your first post. To unsubscribe from this group, send email to clojure+unsubscr...@googlegroups.com For more options, visit this group at http://groups.google.com/group/clojure?hl=en