On Jun 23, 4:25 am, Jules <julesjac...@gmail.com> wrote: > Let N be the total number of elements in your collection (e.g. > 1000,000), and n the number of elements that you want to take out of > this collection (e.g 10). > > By sorting the collection of N elements you spend N log N time. By > repeatedly adding an to the small collection and removing the minimum > you spend N log n time (if you have a good sorted collection that > takes log n time to insert). > > The algorithm spelled out in imperative pseudocode is: > > def top_n(n,coll) > let result = new SortedCollection(take n from coll) > for x in (drop n from coll): > result.add(x) > result.deleteMinimum() // adding 1 element and removing 1 keeps > the result collection at size n > return result > > Hope this helps, > Jules
Here's the functional version (working Clojure code): (defn top [n comptr coll] (let [m (reduce #(assoc %1 %2 true) (sorted-map-by comptr) (take n coll))] (keys (reduce #(let [t (assoc %1 %2 true)] (dissoc t (first (last t)))) m (drop n coll))))) Why a map with useless values? There's no sorted-set-by in clojure.core. :( This is the result: user=> (time (top 10 #(> %1 %2) (range 5000000))) "Elapsed time: 15755.155 msecs" (4999999 4999998 4999997 4999996 4999995 4999994 4999993 4999992 4999991 4999990) user=> (time (take 10 (sort #(> %1 %2) (range 5000000)))) "Elapsed time: 7938.67752 msecs" (4999999 4999998 4999997 4999996 4999995 4999994 4999993 4999992 4999991 4999990) This is a worst-case sort, in a sense, since the sequence has to be completely reversed. Curiously, take 10 sort is faster than top 10 by a factor of 2. It seems sorting five million items all at once is faster than five million sequential sorted inserts into a ten-item tree. So top is using about twice as much time, and one five-hundred- thousandth as much memory. Lesson: use top when you don't want a big lazy sequence residing in memory all at once and use sort when you want speed. :) --~--~---------~--~----~------------~-------~--~----~ You received this message because you are subscribed to the Google Groups "Clojure" group. To post to this group, send email to clojure@googlegroups.com Note that posts from new members are moderated - please be patient with your first post. To unsubscribe from this group, send email to clojure+unsubscr...@googlegroups.com For more options, visit this group at http://groups.google.com/group/clojure?hl=en -~----------~----~----~----~------~----~------~--~---