That explains a lot. The lecture I linked, DeepMind x UCL | Deep Learning
Lectures | 8/12 | Attention and Memory in Deep Learning
<https://www.youtube.com/watch?v=AIiwuClvH6k>, showed attention and window
shifts, but I was not able to fully integrate that into my thinking about
attention and Transformers.
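For what it's worth, the core Transformer operation the lecture builds toward is scaled dot-product attention: each query scores all keys, the scores are softmaxed into weights, and the output is a weighted sum of values. A minimal NumPy sketch (the shapes and random inputs here are illustrative, not from the lecture):

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)          # (n_queries, n_keys) similarity scores
    # Numerically stable softmax over the key axis
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ V                       # weighted sum of values

# Toy example: 3 query positions attending over 4 key/value positions
rng = np.random.default_rng(0)
Q = rng.standard_normal((3, 2))
K = rng.standard_normal((4, 2))
V = rng.standard_normal((4, 2))
out = scaled_dot_product_attention(Q, K, V)
print(out.shape)  # (3, 2): one output vector per query
```

The "window shifts" in the lecture correspond to restricting which keys each query may attend to (a sliding window over positions), rather than the full all-pairs score matrix above.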
------------------------------------------
Artificial General Intelligence List: AGI
Permalink:
https://agi.topicbox.com/groups/agi/Tefaeb8e790a54cec-M85e5997801258c1e37ec5323