On Jun 23, 10:46 pm, Bradbev <brad.beveri...@gmail.com> wrote: > A further optimization would be to keep track of the lowest value in > your "keep" set. A simple compare against that value will eliminate > many of the add/removes from the keep set.
(defn top [n comptr coll] (let [m (reduce #(assoc %1 %2 true) (sorted-map-by comptr) (take n coll))] (keys (reduce #(let [worst (first (last %1))] (if (comptr %2 worst) (assoc (dissoc %1 worst) %2 true) %1)) m (drop n coll))))) user=> (time (top 10 #(< %1 %2) (range 5000000))) "Elapsed time: 11768.17628 msecs" (0 1 2 3 4 5 6 7 8 9) user=> (time (top 10 #(> %1 %2) (range 5000000))) "Elapsed time: 17919.84428 msecs" (4999999 4999998 4999997 4999996 4999995 4999994 4999993 4999992 4999991 4999990) Marginally slower than the version without this optimization in the worst case where every replacement must be done, and about 25% faster in the best case where none must be done. Even then, it's still beat by (take 10 (sort ... )). Detailed results: user=> (dotimes [_ 10] (time (take 10 (sort #(> %1 %2) (range 5000000))))) "Elapsed time: 4669.87556 msecs" "Elapsed time: 4893.38316 msecs" "Elapsed time: 5013.23172 msecs" "Elapsed time: 4990.9178 msecs" "Elapsed time: 4892.86396 msecs" "Elapsed time: 4875.61796 msecs" "Elapsed time: 4897.41252 msecs" "Elapsed time: 4808.3688 msecs" "Elapsed time: 4885.01024 msecs" "Elapsed time: 4825.45776 msecs" nil user=> (dotimes [_ 10] (time (take 10 (sort #(< %1 %2) (range 5000000))))) "Elapsed time: 1915.16492 msecs" "Elapsed time: 1893.42468 msecs" "Elapsed time: 1787.757 msecs" "Elapsed time: 1845.62508 msecs" "Elapsed time: 1795.7726 msecs" "Elapsed time: 1741.83024 msecs" "Elapsed time: 1893.88368 msecs" "Elapsed time: 1782.7532 msecs" "Elapsed time: 1860.6954 msecs" "Elapsed time: 1800.88284 msecs" nil user=> (dotimes [_ 10] (time (top 10 #(> %1 %2) (range 5000000)))) "Elapsed time: 15386.11028 msecs" "Elapsed time: 14990.3236 msecs" "Elapsed time: 16132.34848 msecs" "Elapsed time: 15235.8524 msecs" "Elapsed time: 15176.44992 msecs" "Elapsed time: 14949.18796 msecs" "Elapsed time: 15008.48088 msecs" "Elapsed time: 14931.59616 msecs" "Elapsed time: 15066.244 msecs" "Elapsed time: 15013.44288 msecs" nil user=> (dotimes [_ 10] (time (top 10 #(< %1 %2) (range 5000000)))) "Elapsed time: 9880.3546 msecs" "Elapsed time: 10076.03504 msecs" "Elapsed time: 10145.5228 msecs" "Elapsed time: 10393.19332 msecs" "Elapsed time: 10482.33532 msecs" "Elapsed time: 10184.82972 msecs" "Elapsed time: 10156.60292 msecs" "Elapsed time: 10089.50916 msecs" "Elapsed time: 10266.2736 msecs" "Elapsed time: 10054.39296 msecs" nil JIT has definitely had time to run. Normalized, we have these proportions: Sort best case: 3 Sort worst case: 8 Top best case: 17 Top worst case: 25 Even with the optimization, sort somehow beats top for speed. It looks like top is best used to avoid major memory consumption for long seqs; if you have the memory and need the speed, sort's better. --~--~---------~--~----~------------~-------~--~----~ You received this message because you are subscribed to the Google Groups "Clojure" group. To post to this group, send email to clojure@googlegroups.com Note that posts from new members are moderated - please be patient with your first post. To unsubscribe from this group, send email to clojure+unsubscr...@googlegroups.com For more options, visit this group at http://groups.google.com/group/clojure?hl=en -~----------~----~----~----~------~----~------~--~---