I think someone else linked to the transformers paper. The best text
compressors in http://mattmahoney.net/dc/text.html use context mixing
algorithms with dictionary preprocessing to replace words with symbols.
Context mixing uses lots of models to predict one bit at a time and neural
networks to combine the predictions for arithmetic coding. Contexts can
start on word boundaries to model the lexical structure of the text,
multi-word boundaries to model grammar, and can skip words to model
semantics. Another technique is to group related words in the dictionary
like "brother" with "sister" and drop the low bits of the symbols from the
context. Related words can be found by proximity in the text to be
compressed, or by clustering in context space.
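To make the mixing step concrete, here is a toy sketch of the idea (my own
illustration, not cmix or any actual entry): several models each output a
probability that the next bit is 1, and a single logistic mixer combines
their stretched predictions and learns weights online. The model
probabilities, learning rate, and bit sequence below are all made up for
demonstration; a real compressor would select weight sets by context and
feed the mixed probability to an arithmetic coder.

```python
import math

def stretch(p):
    # logit: probability -> log-odds
    return math.log(p / (1 - p))

def squash(x):
    # inverse logit: log-odds -> probability
    return 1 / (1 + math.exp(-x))

class Mixer:
    """One-layer logistic mixer over n bit-prediction models."""
    def __init__(self, n, lr=0.02):
        self.w = [0.0] * n    # mixing weights, learned online
        self.lr = lr
        self.x = [0.0] * n    # stretched inputs from the last mix() call

    def mix(self, probs):
        self.x = [stretch(p) for p in probs]
        return squash(sum(w * xi for w, xi in zip(self.w, self.x)))

    def update(self, p, bit):
        # gradient step that minimizes coding cost of the actual bit
        err = bit - p
        self.w = [w + self.lr * err * xi for w, xi in zip(self.w, self.x)]

# Two toy "models": one always predicts 0.9, the other 0.3.
# The bit stream is 75% ones, so the mixer should learn a blend near 0.75.
m = Mixer(2)
bits = [1, 1, 0, 1, 1, 1, 0, 1] * 50
cost = 0.0
for b in bits:
    p = m.mix([0.9, 0.3])
    cost += -math.log2(p if b else 1 - p)  # ideal arithmetic-coded length
    m.update(p, b)
avg = cost / len(bits)
print(round(avg, 3))  # average bits per input bit; below 1.0 means compression
```

The coding cost -log2(p) is exactly what an ideal arithmetic coder would
charge for each bit, which is why the mixer's training objective and the
compressed size are the same thing.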

The top programs like cmix need 32 GB of RAM and a week to compress 1 GB
because they use thousands of context models. I describe context mixing and
other techniques in http://mattmahoney.net/dc/dce.html
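The dictionary preprocessing and low-bit trick mentioned above can be
sketched in a few lines (again my own toy illustration, with a made-up
dictionary and made-up codes, not the actual preprocessor used by any
entry): frequent words get short symbols, related words get adjacent codes,
and a context model drops the low bit to treat each related pair as one
coarse context.

```python
# Hypothetical dictionary: related word pairs share all bits but the lowest,
# so "brother"/"sister" differ only in bit 0, as do "father"/"mother".
dictionary = {"brother": 0x80, "sister": 0x81, "father": 0x82, "mother": 0x83}

def encode(words):
    # Replace dictionary words with their symbols; pass other tokens through.
    return [dictionary.get(w, w) for w in words]

def coarse_context(symbol):
    # Dropping the low bit merges each related pair into one context,
    # so a model conditioned on coarse_context sees "brother" and
    # "sister" as the same history.
    return symbol >> 1

symbols = encode("my brother and my sister".split())
print(symbols)
print(coarse_context(dictionary["brother"]) == coarse_context(dictionary["sister"]))  # True
```

A decoder would invert the same dictionary, so the transform is lossless;
the gain comes from the context models seeing shorter, more regular
symbols.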



On Sun, Oct 6, 2019, 9:31 PM <[email protected]> wrote:

> Yes I read all of it Matt 2 months ago, it was thrilling to read.
>
> @James, I did intend for them both to be combined. Above are 3
> visualizations, each with 2 sticks of a certain length. My point was that
> the size of the data you start with is the same whichever stick is the
> original size...the actual compression begins when you actually try to
> compress it, making both sticks shorter. Some decompression programs may
> be larger than others, so it was only an idea that if both are evenly
> long, both may be as short as possible.
>
> I noticed that patterns mean compression is possible. A certain algorithm
> like the one in 7zip can compress (at least to a fair amount) seemingly
> any nonrandom txt file fed to it. And a human brain can attempt to
> compress any given txt file to the maximum amount.
> So, while there may not be maximal compression, polynomial solving, or
> fully optimal intelligence (calculator vs. brain abilities), you can on a
> larger scale have a fully optimal one that works for many problems.
> Indeed it seems so: just as one given program can't turn a structure
> into anything as fast as possible, a cluster of them can, in other words
> other programs, because often we can't survive with only 1 program, so we
> end up with a cluster of organs, an optimal one. And in fact this begins
> to become a single program at the high level. An optimal one, according
> to its equilibrium state.
>
> "The best text compressors model the lexical, semantic, and syntactic
> structure of natural language. The whole point is to encourage AI research."
> Can you correct me if I'm wrong: does the previous winning program
> (basically) predict the next word in the 100 MB as a lossless predictive
> program?
> And I noticed you linked a paper on Transformers; you are looking into
> those?

------------------------------------------
Artificial General Intelligence List: AGI
Permalink: 
https://agi.topicbox.com/groups/agi/T2d0576044f01b0b1-M35e1a709d44aa1fe54e4ce18
Delivery options: https://agi.topicbox.com/groups/agi/subscription
