On Wed, Dec 29, 2010 at 6:41 PM, Mark Engelberg <mark.engelb...@gmail.com> wrote: > On Wed, Dec 29, 2010 at 1:29 PM, Andreas Kostler > <andreas.koestler.le...@gmail.com> wrote: >> >> On 30/12/2010, at 2:39 AM, Mark Engelberg wrote: >> >>> I recommend reading: >>> http://htdp.org/2003-09-26/Book/curriculum-Z-H-16.html#node_sec_12.2 >>> for a detailed discussion of how to design and write insertion sort in >>> a functional language, using linked lists and recursion. >> That's what I was looking for. However, Ken's version was a real eye opener >> :) >> Thanks > > Andreas, > > Unfortunately, Ken's version doesn't work on large lists, and suffers > from the nested-laziness problem that everyone's been talking about. > It's worth studying his code, though, because it's actually a great > illustration of how you have to be super-careful when working with > laziness, and how laziness sometimes makes it very hard to determine > runtime performance characteristics. > > Using his code, try (first (insertion-sort (range 10000))). I'm using > Sun's JVM, in -server mode, and this generates a stack overflow. > > Aside from that problem, there are a few other details worth > mentioning about his version. The split-with causes the first part of > the list (up to the insertion point) to be traversed twice, which is > somewhat more inefficient than it needs to be. BTW, the use of concat > is where the nested laziness problem creeps in. His clever use of > reduce unfortunately makes this not a "stable sort", because elements > that compare as equal will end up having their orders reversed. > (Doesn't matter if you're only sorting numbers, but could matter if > you're sorting records based on some field). > > So again, I encourage you to first start by porting the Htdp version > of to Clojure. It's always good to start with the clearest, most > elegant "classic" implementation. Here's how it looks (adapted to > sort in increasing, rather than decreasing order): > > (defn insert [n alon] > (cond > (empty? alon) (cons n nil) > (<= n (first alon)) (cons n alon) > (> n (first alon)) (cons (first alon) (insert n (rest alon))))) > > (defn isort [alon] > (cond > (empty? alon) nil > (seq? alon) (insert (first alon) (isort (rest alon))))) > > This is a fairly literal translation to Clojure. It works, but in > Clojure, generates a stack overflow on large lists. > Now, the question you should ask yourself is: > What are the minimal changes I can make to this code to eliminate the > stack overflow problem? > > This will be a valuable exercise that will help you with all Clojure > code you write later, so make sure you can answer this question before > trying to get all fancy :) Ask if you need further assistance...
I was demonstrating the most concise, idiomatic expression of the algorithm, rather than the most useful for, say, large lists. As for making it stable, changing (reduce insert-into [] s) to (reduce insert-into [] (reverse s)) should suffice, though it will traverse s one more time. (And if split-with traverses the input list twice, that's an implementation issue with split-with. Perhaps it could be rewritten to be more efficient.) For production use I'd just use the existing sorted-foo structures, the java.util sort, or implement quicksort or mergesort, but if I had to use insertion sort, something like this would probably be moderately efficient and work on large inputs: (defn insert-into [s x] (loop [s (seq s) o [] xin? false] (if s (let [y (first s)] (if xin? (recur (next s) (conj o y) true) (if (< y x) (recur (next s) (conj o y) false) (recur (next s) (conj (conj o x) y) true)))) (if xin? o (conj o x))))) (defn insertion-sort [s] (reduce insert-into [] (reverse s))) This is a stable sort and works for large inputs, though it's quadratic. (Bubble sort can probably be made faster since it can take advantage of transient.) Memory use stays bounded and fairly low. user=> (def q (Integer. 280)) #'user/q user=> (def r (Integer. 280)) #'user/r user=> (identical? q r) false ; so, we can tell these two apart user=> (insertion-sort [3 17 360 q 23 r 42]) [3 17 23 42 280 280 360] user=> (identical? (nth (insertion-sort [3 17 360 q 23 r 42]) 4) q) true ; the sort is stable user=> (nth (insertion-sort (reverse (range 100000))) 77512) 77512 ; 15 whole minutes after being submitted, natch ; but, no stack overflow -- You received this message because you are subscribed to the Google Groups "Clojure" group. To post to this group, send email to clojure@googlegroups.com Note that posts from new members are moderated - please be patient with your first post. To unsubscribe from this group, send email to clojure+unsubscr...@googlegroups.com For more options, visit this group at http://groups.google.com/group/clojure?hl=en