I needed a random sampling function for work and wrote this in
Clojure.

(defn random-sample
  "Take a random sample of size `sample-size' from the `items'
sequence.

This uses Algorithm R -- random sampling with a reservoir.  It
requires O(`sample-size') space and does not need to know the size of
`items' ahead of time.  Described in Knuth's The Art of Computer
Programming, volume 2."
  [sample-size items]
  (let [maybe-insert-item
        (fn [[num current] item]
          (if (< num sample-size)
            [(inc num) (conj current item)]
            (let [index (rand-int num)]
              (if (< index sample-size)
                [(inc num) (assoc current index item)]
                [(inc num) current]))))]
    (second (reduce maybe-insert-item [0 []] items))))


It seems to work fine, but it strikes me as a little busy with ()[],
even for Lisp. Does anyone have any pointers? Will the internal
function "maybe-insert-item" be evaluated every time "random-sample"
is called, and if so, is that worth worrying about wrt. program
optimization?

Does this sort of thing belong in clojure-contrib or somewhere else?
It seems awfully short but it's a nice function to have, at least for
me.

Thanks,
Steve

--~--~---------~--~----~------------~-------~--~----~
You received this message because you are subscribed to the Google Groups 
"Clojure" group.
To post to this group, send email to clojure@googlegroups.com
To unsubscribe from this group, send email to [EMAIL PROTECTED]
For more options, visit this group at 
http://groups.google.com/group/clojure?hl=en
-~----------~----~----~----~------~----~------~--~---

Reply via email to