Sorry that should be:
(def represent (memoize represent))
On Feb 20, 5:18 pm, Jules wrote:
> Here is my solution that solves the problem optimally. Observe that
> given a sequence we can do one of two things: (1) represent it as a
> repetition or (2) represent it as a concatenation of repetition
Here is my solution that solves the problem optimally. Observe that
given a sequence we can do one of two things: (1) represent it as a
repetition or (2) represent it as a concatenation of repetitions. If
we had two functions to calculate all the ways of doing (1) and (2) we
could solve the problem
Very interesting solution. I implemented an LZ77 like solution, which
appears to be working, but its good to have alternative solutions. As
you point out, different match patterns might be interesting. I tune
the matching by disallowing small matches based on length, but the
pattern string would be
Interesting. This problem is like compression, but the first thing I
thought of was pattern recognition. If your tokens are just characters
in a string, you can let regular expressions do all the heavy lifting:
(defn rep-groups [s] (re-seq #"(.+)\1+|." s))
Then (rep-groups ".,ababcababc.,cc,.abab
Thanks for the hint on LZ77. That was exactly what I was looking for.
Thank you also to the others who provided this and other answers.
Thanks!
-Mark
On Feb 13, 1:00 pm, Mark Engelberg wrote:
> This is essentially a compression problem. I think you want to research the
> topic of "Run-length
DEFLATE is a compression method that combines LZ77 with Huffman
coding. It is implemented in java.util.zip.
Read about it here:
http://download.oracle.com/javase/1.4.2/docs/api/java/util/zip/package-summary.html
fpc
--
You received this message because you are subscribed to the Google
Groups "
On Feb 13, 2011, at 10:11 AM, Mark Fredrickson wrote:
Hello friends,
I am writing a program to generate directions for humans to read. The
directions are composed of a collection with a series of steps, some
of which may be duplicated:
[:a :b :c :a :b :c]
I would like to compress them by ind
This is essentially a compression problem. I think you want to research the
topic of "Run-length encoding", and then look at algorithms like LZ77 which
handle runs of repeated characters. LZ77 finds repeated blocks that are
some distance apart; so, for example, you could restrict the algorithm to
Hello friends,
I am writing a program to generate directions for humans to read. The
directions are composed of a collection with a series of steps, some
of which may be duplicated:
[:a :b :c :a :b :c]
I would like to compress them by indicating repeats, using this
notation:
[[2 [:a :b :c]]
Si