Sorry that should be: (def represent (memoize represent))
On Feb 20, 5:18 pm, Jules <julesjac...@gmail.com> wrote: > Here is my solution that solves the problem optimally. Observe that > given a sequence we can do one of two things: (1) represent it as a > repetition or (2) represent it as a concatenation of repetitions. If > we had two functions to calculate all the ways of doing (1) and (2) we > could solve the problem as follows: > > (defn represent [s] (apply min-key cost (concat (reps s) (concats > s)))) > > That is, we pick out the minimum cost way of representing s as a > repetition or as a concatenation of repetitions, where reps returns > all repetitions and concats returns all concatenations. > > The cost function to count the number of symbols is: > > (defn cost [repr] (apply + (map #(count (% 1)) repr))) > > Of course you can use another cost function. > > The function reps can be defined as follows: > > (defn reps [s] > (map vector (filter #(= s (apply concat (repeat (first %) (second > %)))) > (map #(vector (/ (count s) %) (take % s)) (range 1 (+ 1 > (count > s))))))) > > For example, reps on ababab gives: [3 ab] and [1 ababab]. Note that > this algorithm is very naive: it tries all repetitions [n (take n s)] > and eliminates invalid ones. The code may look involved but the > algorithm is rather simplistic. > > The function concats is as follows: > > (defn concats [s] > (map #(apply concat (map represent (split-at % s))) (range 1 (- > (count s) 1)))) > > It works like this. We split s at all possible boundaries: ababab => a > babab, ab abab, aba bab, abab ab, ababa b. Then we recursively > calculate the optimal representation of each of the pairs, and > concatenate the representations. This gives us all ways to represent s > as a concatenation of optimal representations. > > That is all it takes :) > > (represent '(a b b a b b a)) => ([1 (a)] [2 (b b a)]) > > Unfortunately the algorithm is exponential time. We can fix this by > memoizing represent: > > (memoize represent) > > Now the algorithm is polynomial time. It is still rather slow for > large sequences (but there is a lot of room for optimization), and > gives stack overflows. You can fix this by eliminating the recursion > and filling the memoization table iteratively. > > Jules -- You received this message because you are subscribed to the Google Groups "Clojure" group. To post to this group, send email to clojure@googlegroups.com Note that posts from new members are moderated - please be patient with your first post. To unsubscribe from this group, send email to clojure+unsubscr...@googlegroups.com For more options, visit this group at http://groups.google.com/group/clojure?hl=en