My Reddit thread: https://www.reddit.com/r/agi/comments/gmmrbr/a_natural_and_explainable_brain_my_design/
The world record for compressing enwik8 (100 MB, mostly text) is 14.8 MB. Going by OpenAI's own benchmark, GPT-2 could compress enwik8 to roughly 12 MB. My compressor, explained in the video, is fully understood and scores 21.8 MB. Hutter Prize entries have already shown that grouping related words (mom/father) helps compression a great amount, and so does boosting recently activated letters, or better yet whole words (similar words are likely to occur again soon), which I have yet to add to my letter predictor. So I understand well how to reach a score of about 16 MB at least. I think that lends a lot of weight to my points. So why don't others explain GPT-2 or Transformers like this? Why all the hard black-box algebra?

Another thing my current code doesn't do yet is robust fuzzy matching for typos, and time-delay similarity of positions: "hor5e the" should partially activate "that zebra", because horse equals zebra to some degree and sits in a similar position in the order, carried convolution-style up to higher layers. I have a lot of other advanced ideas too, and there are also pre-compression factors, like sorting/transforming the enwik8 articles before training on them, that help prediction.

------------------------------------------
Artificial General Intelligence List: AGI
Permalink: https://agi.topicbox.com/groups/agi/T32d495083e92cc87-M3b7f75b1bbf975552546492f
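For anyone who wants to check the numbers above, the conversion between a predictor's cross-entropy (bits per byte) and an ideal compressed size is simple arithmetic. The ~12 MB figure for GPT-2 follows from its reported ~0.93 bits per byte on enwik8 (a sketch; treat the exact benchmark number as an assumption):

```python
def compressed_size_mb(bits_per_byte: float, n_bytes: int = 100_000_000) -> float:
    """Ideal arithmetic-coded output size, in MB, for a predictor that
    achieves the given cross-entropy (bits per byte) on an n_bytes input."""
    return bits_per_byte * n_bytes / 8 / 1e6

# GPT-2's reported ~0.93 bits per byte on enwik8:
print(compressed_size_mb(0.93))   # 11.625 -> "about 12 MB"

# The 14.8 MB record, expressed the other way around:
print(14.8e6 * 8 / 100e6)         # 1.184 bits per byte
```

The same formula pins my 21.8 MB score at about 1.74 bits per byte, so the gap to the record is under 0.6 bits per byte of prediction quality.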
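The recency-boosting idea can be sketched as a plain frequency model mixed with a decaying bonus for symbols seen in a recent window; the class name, window size, and boost weight here are illustrative choices of mine, not my actual compressor's code:

```python
from collections import Counter, deque

class RecencyBoostedPredictor:
    """Order-0 symbol model whose scores get a bonus for symbols seen in a
    recent sliding window (recently active symbols tend to recur soon)."""

    def __init__(self, window: int = 50, boost: float = 2.0):
        self.counts = Counter()             # long-term frequencies
        self.recent = deque(maxlen=window)  # sliding window of recent symbols
        self.boost = boost

    def score(self, sym: str) -> float:
        # base frequency, plus a bonus per recent occurrence, plus add-one smoothing
        return self.counts[sym] + self.boost * self.recent.count(sym) + 1.0

    def prob(self, sym: str, alphabet: str) -> float:
        total = sum(self.score(s) for s in alphabet)
        return self.score(sym) / total

    def update(self, sym: str):
        self.counts[sym] += 1
        self.recent.append(sym)

p = RecencyBoostedPredictor()
for c in "banana":
    p.update(c)
```

After feeding "banana", the letters a/n get a larger share of the probability than pure frequency alone would give them, which is exactly the effect that helps an arithmetic coder.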
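The "hor5e the" → "that zebra" idea, fuzzy spelling match plus semantic similarity plus positional decay, can be sketched like this; the vocabulary, similarity table, and decay constant are made-up toy values, standing in for what a real system would learn:

```python
from difflib import SequenceMatcher

VOCAB = ["horse", "zebra", "the", "that"]
# toy semantic similarity table (illustrative, not learned)
WORD_SIM = {("horse", "zebra"): 0.6, ("the", "that"): 0.5}

def correct(word: str) -> str:
    """Snap a possibly typo'd word ('hor5e') to the closest vocabulary word."""
    return max(VOCAB, key=lambda v: SequenceMatcher(None, word, v).ratio())

def sem_sim(a: str, b: str) -> float:
    if a == b:
        return 1.0
    return WORD_SIM.get((a, b), WORD_SIM.get((b, a), 0.0))

def phrase_activation(query: list, stored: list, pos_decay: float = 0.5) -> float:
    """Partial activation of a stored phrase: semantic similarity of each
    word pair, weighted down the further apart their positions are."""
    score = 0.0
    for i, q in enumerate(query):
        for j, s in enumerate(stored):
            score += sem_sim(correct(q), s) * pos_decay ** abs(i - j)
    return score / (len(query) * len(stored))

# 'hor5e the' partially activates 'that zebra' via horse~zebra and the~that
print(phrase_activation(["hor5e", "the"], ["that", "zebra"]))
```

The point of the sketch is only the shape of the computation: typo tolerance at the character level, similarity at the word level, and a positional term, combined into one graded activation rather than an all-or-nothing match.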
