On May 9, 2011, at 6:45 AM, Andreas Kostler wrote:

> Hi all,
> I'm trying to calculate the moving average of a certain window size.
> One straight forward approach is:
> (defn lazy-avg [coll]
>  (let [[sum cnt] (reduce
>                   (fn [[v c] val] [(+ val v) (inc c)])
>                   [0 0]
>                   coll)]
>    (if (zero? cnt) 0 (/ sum cnt))))
> 
> (let [window-size 5
>      n 1000000]
>  (map lazy-avg (partition window-size 1 (range 0 n)))
> 
> This takes about 2 seconds for 10^6 elements on my box. How can I
> improve the runtime?
> A slightly more performant (not much) approach keeping a rolling sum
> would be:
> 
> (defn partialsums [start lst]
>  (lazy-seq
>    (if-let [lst (seq lst)]
>          (cons start (partialsums (+ start (first lst)) (rest lst)))
>          (list start))))
> 
> (defn sliding-window-moving-average [window lst]
>  (map #(/ % window)
>       (let [start   (apply + (take window lst))
>             diffseq (map - (drop window lst) lst)]
>         (partialsums start diffseq))))
> 
> However, this causes the jvm to run out of heap space for n > 10^6
> 
> Is ~ 2 seconds the best I can do?

I took your first example and put it into a Clojure source file all its own, 
using AOT compilation and then timing the results of running the program from 
the command line from start to finish.  I was using a 3-year old MacBook Pro 
with 2.4 GHz Intel Core 2 Duo, Mac OS X 10.6.7, Java provided by Apple 
(1.6.0_24-b07-334-10M3326, java -server is a 64-bit JVM).  Note that I've 
previously found significantly different run time and memory uses from 
different JVMs running on different OSs for the same (hardware, Clojure source 
file, Clojure version) combination.  All of my measurements were for a window 
size of 5 and n=10^6.

For your first set of code above, the lowest elapsed time I got on 3 different 
runs was 10.338 sec.  Total memory GCed was about 1.8 GBytes (I was using 
-Xmx256m on the java command line).

For your second set of code above, the lowest elapsed time among 3 runs was 
6.423 sec, with about 570 Mbytes of memory GCed during the run (also using 
-Xmx256m).

With the source code below, I got as low as 3.329 sec of elapsed time with 
about 340 MBytes of memory GCed during the run (again -Xmx256m).  Unlike your 
programs, this one is not lazy on the input sequence -- it always processes the 
whole thing.  I don't hold it up as a model of Clojure code for readability or 
maintainability, but for squeezing run time out of the inner loop.

Andy Fingerhut


(ns movavg
  (:gen-class))

(set! *warn-on-reflection* true)


(defn sliding-window-moving-average [window lst]
  (let [w (int window)
        w-1 (int (dec w))]
    (loop [rolling-sum (apply + (take w lst))
           last-w (object-array (take w lst))
           i (int 0)
           avgs (transient [(/ rolling-sum w)])
           lst (drop w lst)]
      (if-let [lst (seq lst)]
        (let [old-val (aget last-w i)
              new-val (first lst)
              next-rolling-sum (- (+ rolling-sum new-val) old-val)]
          (aset last-w i new-val)
          (recur next-rolling-sum
                 last-w
                 (if (== i w-1) (int 0) (inc i))
                 (conj! avgs (/ next-rolling-sum w))
                 (rest lst)))
        ;; else
        (persistent! avgs)))))


(defn -main [& args]
  (let [window-size (Integer/parseInt (nth args 0))
        n (Integer/parseInt (nth args 1))]
    (println (sliding-window-moving-average window-size (range 0 n)))))

-- 
You received this message because you are subscribed to the Google
Groups "Clojure" group.
To post to this group, send email to clojure@googlegroups.com
Note that posts from new members are moderated - please be patient with your 
first post.
To unsubscribe from this group, send email to
clojure+unsubscr...@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/clojure?hl=en

Reply via email to