Manning's SEQUITUR is one of those innovations that strikes me as
fundamental.  It's incredibly fast and low complexity for its
effectiveness.  This, to me, indicates nature probably somehow incorporates
it as a neocortical hardware primitive.  The important question I've not
seen addressed in the grammar compression literature is the exact --
searching for a word -- "phase transition" boundary between SEQUITUR and
backtracking.

On Thu, May 28, 2020 at 9:04 AM Matt Mahoney <[email protected]>
wrote:

> Your idea is similar to
> Compression by induction of hierarchical grammars
> by Craig G. Nevill-Manning,
> Ian H . Witten & David L. Maulsby,
> October, 1993
>
> It looks promising but I don't know of any practical implementations. You
> have to make multiple passes to find the most frequent pair of symbols, but
> it is faster (but suboptimal) to find many frequent pairs at once in fewer
> passes.
>
> On Thu, May 28, 2020, 8:58 AM stefan.reich.maker.of.eye via AGI <
> [email protected]> wrote:
>
>> I think I got the theoretical runtime down to O(n) if we skip sorting the
>> literal lines in the output file (which doesn't make any difference to the
>> core idea).
>>
>> So the algorithm basically compresses a list of ints to a compact binary
>> graph in O(n) time. It does so like this:
>>
>> while true:
>>   (i, j) = most often (and at least twice!) occurring adjacent pair in
>> list; end loop if none found
>>   k = index of new or existing table entry (i, j)
>>   replace all occurrences of (i, j) in list with (k)
>>
>> Each step of the loop can be done in constant time by clever use of hash
>> maps and linked lists. That almost sounds impossible, but there we are,
>> just finished the code. In practical terms, for enwik8, it's just about as
>> fast than the last version that was more O(n log n)-ish.
>>
>> I'm truly wondering if this algorithm has been discovered before.
>>
> *Artificial General Intelligence List <https://agi.topicbox.com/latest>*
> / AGI / see discussions <https://agi.topicbox.com/groups/agi> +
> participants <https://agi.topicbox.com/groups/agi/members> + delivery
> options <https://agi.topicbox.com/groups/agi/subscription> Permalink
> <https://agi.topicbox.com/groups/agi/Tb2cf064c700f181c-M8cdf4daa6a5627de0e2fb67e>
>

------------------------------------------
Artificial General Intelligence List: AGI
Permalink: 
https://agi.topicbox.com/groups/agi/Tb2cf064c700f181c-M920dc2a0fd0d32d63424f9fe
Delivery options: https://agi.topicbox.com/groups/agi/subscription

Reply via email to