Hi all,

I was just playing about with sorting the other day and I came up with the following. I'll say upfront that performance wasn't my goal, however the performance of `freqsort` caught me completely off guard! It's surprisingly decent to say the least... I'm truly amazed given that my grandma could literally read this (!!!) :)


(defn sorted-frequencies
  "Same as `clojure.core/frequencies` but returns a sorted-map." [coll]
  (let [?increment (fnil unchecked-inc0)]
    (reduce (fn [counts x]
              (update counts x ?increment))
            (sorted-map)
            coll)))

(defn freqsort
"A sorting technique based on `clojure.core/frequencies` + sorted-map (for the actual sorting), and transducers (for the final concatenation), exhibiting very decent performance (for 1E6 elements just above 2x the core fn)." [xs]
  (into [] (comp (map (fn [[x n]]
                        ;; the x occurred n times (repeat n x)))
                 cat)
        (sorted-frequencies xs)))


I've seen/implemented various other sorting algorithms in Clojure, and this is by far the fastest and cleanest one i was able to find/write.

;;=================REPL_OUTPUT=============

(use '(criterium.core))

(def data (doall (repeatedly 100 (partial rand-int 5000))))

=> #'treajure.encore/data

(quick-bench (freqsort data))
Evaluation count : 9450 in 6 samples of 1575 calls.
             Execution time mean : 63.675363 µs
    Execution time std-deviation : 1.144335 µs
   Execution time lower quantile : 62.496493 µs ( 2.5%)
   Execution time upper quantile : 65.035663 µs (97.5%)
                   Overhead used : 1.979934 ns
=> nil

(quick-bench (sort data))
Evaluation count : 36642 in 6 samples of 6107 calls.
             Execution time mean : 16.496481 µs *4x faster than `freqsort`*
    Execution time std-deviation : 134.094210 ns
   Execution time lower quantile : 16.361877 µs ( 2.5%)
   Execution time upper quantile : 16.641270 µs (97.5%)
                   Overhead used : 1.979934 ns
=> nil
(def data (doall (repeatedly 1000 (partial rand-int 5000))))

=> #'treajure.encore/data

(quick-bench (freqsort data))
Evaluation count : 678 in 6 samples of 113 calls.
             Execution time mean : 929.610743 µs
    Execution time std-deviation : 83.275723 µs
   Execution time lower quantile : 885.340646 µs ( 2.5%)
   Execution time upper quantile : 1.073217 ms (97.5%)
                   Overhead used : 1.979934 ns

Found 1 outliers in 6 samples (16.6667 %)
    low-severe     1 (16.6667 %)
Variance from outliers : 15.7652 % Variance is moderately inflated by outliers
=> nil

(quick-bench (sort data))
Evaluation count : 1986 in 6 samples of 331 calls.
Execution time mean : 305.248705 µs *3x faster than `freqsort`*
    Execution time std-deviation : 2.572377 µs
   Execution time lower quantile : 303.218408 µs ( 2.5%)
   Execution time upper quantile : 309.553407 µs (97.5%)
                   Overhead used : 1.979934 ns

Found 1 outliers in 6 samples (16.6667 %)
    low-severe     1 (16.6667 %)
Variance from outliers : 13.8889 % Variance is moderately inflated by outliers
=> nil
(def data (doall (repeatedly 10000 (partial rand-int 5000))))

=> #'treajure.encore/data

(quick-bench (freqsort data))
Evaluation count : 54 in 6 samples of 9 calls.
             Execution time mean : 12.557872 ms
    Execution time std-deviation : 976.670436 µs
   Execution time lower quantile : 12.031381 ms ( 2.5%)
   Execution time upper quantile : 14.247191 ms (97.5%)
                   Overhead used : 1.979934 ns

Found 1 outliers in 6 samples (16.6667 %)
    low-severe     1 (16.6667 %)
Variance from outliers : 15.4707 % Variance is moderately inflated by outliers
=> nil

(quick-bench (sort data))
Evaluation count : 150 in 6 samples of 25 calls.
             Execution time mean : 4.058038 ms *3x faster than `freqsort`*
    Execution time std-deviation : 86.703920 µs
   Execution time lower quantile : 3.979264 ms ( 2.5%)
   Execution time upper quantile : 4.159908 ms (97.5%)
                   Overhead used : 1.979934 ns
=> nil
(def data (doall (repeatedly 100000 (partial rand-int 5000))))

=> #'treajure.encore/data

(quick-bench (freqsort data))
Evaluation count : 6 in 6 samples of 1 calls.
             Execution time mean : 132.138750 ms
    Execution time std-deviation : 10.379214 ms
   Execution time lower quantile : 138.162338 ms ( 2.5%)
   Execution time upper quantile : 161.986112 ms (97.5%)
                   Overhead used : 1.979934 ns

Found 1 outliers in 6 samples (16.6667 %)
    low-severe     1 (16.6667 %)
Variance from outliers : 15.2716 % Variance is moderately inflated by outliers
=> nil

(quick-bench (sort data))
Evaluation count : 12 in 6 samples of 2 calls.
Execution time mean : 57.354588 ms *~2x faster than `freqsort`*
    Execution time std-deviation : 1.234770 ms
   Execution time lower quantile : 55.048058 ms ( 2.5%)
   Execution time upper quantile : 58.109885 ms (97.5%)
                   Overhead used : 1.979934 ns
=> nil

;;===========================================


GO CLOJURE :)


Regards,

Jim

--
You received this message because you are subscribed to the Google
Groups "Clojure" group.
To post to this group, send email to clojure@googlegroups.com
Note that posts from new members are moderated - please be patient with your 
first post.
To unsubscribe from this group, send email to
clojure+unsubscr...@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/clojure?hl=en
--- You received this message because you are subscribed to the Google Groups "Clojure" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to clojure+unsubscr...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Reply via email to