Hi all,
I was just playing about with sorting the other day and I came up with
the following. I'll say upfront that performance wasn't my goal, however
the performance of `freqsort` caught me completely off guard! It's
surprisingly decent to say the least... I'm truly amazed given that my
grandma could literally read this (!!!) :)
(defn sorted-frequencies
"Same as `clojure.core/frequencies` but returns a sorted-map." [coll]
(let [?increment (fnil unchecked-inc0)]
(reduce (fn [counts x]
(update counts x ?increment))
(sorted-map)
coll)))
(defn freqsort
"A sorting technique based on `clojure.core/frequencies` + sorted-map
(for the actual sorting), and transducers (for the final concatenation),
exhibiting very decent performance (for 1E6 elements just above 2x the
core fn)." [xs]
(into [] (comp (map (fn [[x n]]
;; the x occurred n times (repeat n x)))
cat)
(sorted-frequencies xs)))
I've seen/implemented various other sorting algorithms in Clojure, and
this is by far the fastest and cleanest one i was able to find/write.
;;=================REPL_OUTPUT=============
(use '(criterium.core))
(def data (doall (repeatedly 100 (partial rand-int 5000))))
=> #'treajure.encore/data
(quick-bench (freqsort data))
Evaluation count : 9450 in 6 samples of 1575 calls.
Execution time mean : 63.675363 µs
Execution time std-deviation : 1.144335 µs
Execution time lower quantile : 62.496493 µs ( 2.5%)
Execution time upper quantile : 65.035663 µs (97.5%)
Overhead used : 1.979934 ns
=> nil
(quick-bench (sort data))
Evaluation count : 36642 in 6 samples of 6107 calls.
Execution time mean : 16.496481 µs *4x faster than `freqsort`*
Execution time std-deviation : 134.094210 ns
Execution time lower quantile : 16.361877 µs ( 2.5%)
Execution time upper quantile : 16.641270 µs (97.5%)
Overhead used : 1.979934 ns
=> nil
(def data (doall (repeatedly 1000 (partial rand-int 5000))))
=> #'treajure.encore/data
(quick-bench (freqsort data))
Evaluation count : 678 in 6 samples of 113 calls.
Execution time mean : 929.610743 µs
Execution time std-deviation : 83.275723 µs
Execution time lower quantile : 885.340646 µs ( 2.5%)
Execution time upper quantile : 1.073217 ms (97.5%)
Overhead used : 1.979934 ns
Found 1 outliers in 6 samples (16.6667 %)
low-severe 1 (16.6667 %)
Variance from outliers : 15.7652 % Variance is moderately inflated by
outliers
=> nil
(quick-bench (sort data))
Evaluation count : 1986 in 6 samples of 331 calls.
Execution time mean : 305.248705 µs *3x faster than
`freqsort`*
Execution time std-deviation : 2.572377 µs
Execution time lower quantile : 303.218408 µs ( 2.5%)
Execution time upper quantile : 309.553407 µs (97.5%)
Overhead used : 1.979934 ns
Found 1 outliers in 6 samples (16.6667 %)
low-severe 1 (16.6667 %)
Variance from outliers : 13.8889 % Variance is moderately inflated by
outliers
=> nil
(def data (doall (repeatedly 10000 (partial rand-int 5000))))
=> #'treajure.encore/data
(quick-bench (freqsort data))
Evaluation count : 54 in 6 samples of 9 calls.
Execution time mean : 12.557872 ms
Execution time std-deviation : 976.670436 µs
Execution time lower quantile : 12.031381 ms ( 2.5%)
Execution time upper quantile : 14.247191 ms (97.5%)
Overhead used : 1.979934 ns
Found 1 outliers in 6 samples (16.6667 %)
low-severe 1 (16.6667 %)
Variance from outliers : 15.4707 % Variance is moderately inflated by
outliers
=> nil
(quick-bench (sort data))
Evaluation count : 150 in 6 samples of 25 calls.
Execution time mean : 4.058038 ms *3x faster than `freqsort`*
Execution time std-deviation : 86.703920 µs
Execution time lower quantile : 3.979264 ms ( 2.5%)
Execution time upper quantile : 4.159908 ms (97.5%)
Overhead used : 1.979934 ns
=> nil
(def data (doall (repeatedly 100000 (partial rand-int 5000))))
=> #'treajure.encore/data
(quick-bench (freqsort data))
Evaluation count : 6 in 6 samples of 1 calls.
Execution time mean : 132.138750 ms
Execution time std-deviation : 10.379214 ms
Execution time lower quantile : 138.162338 ms ( 2.5%)
Execution time upper quantile : 161.986112 ms (97.5%)
Overhead used : 1.979934 ns
Found 1 outliers in 6 samples (16.6667 %)
low-severe 1 (16.6667 %)
Variance from outliers : 15.2716 % Variance is moderately inflated by
outliers
=> nil
(quick-bench (sort data))
Evaluation count : 12 in 6 samples of 2 calls.
Execution time mean : 57.354588 ms *~2x faster than
`freqsort`*
Execution time std-deviation : 1.234770 ms
Execution time lower quantile : 55.048058 ms ( 2.5%)
Execution time upper quantile : 58.109885 ms (97.5%)
Overhead used : 1.979934 ns
=> nil
;;===========================================
GO CLOJURE :)
Regards,
Jim
--
You received this message because you are subscribed to the Google
Groups "Clojure" group.
To post to this group, send email to clojure@googlegroups.com
Note that posts from new members are moderated - please be patient with your
first post.
To unsubscribe from this group, send email to
clojure+unsubscr...@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/clojure?hl=en
---
You received this message because you are subscribed to the Google Groups "Clojure" group.
To unsubscribe from this group and stop receiving emails from it, send an email
to clojure+unsubscr...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.