karl3@writeme.com wrote:
> so http://github.com/karl3wm/httptransformer now has a resuming class
>
> i can load llama 405b and process 0.00002% of it on my tiny system, then
> reboot the system and start right off again at 0.00002% and process up to
> 0.00004% ! :D
>
> it doesn't do any paralleliz--
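
for anyone curious, here's roughly the idea behind a resuming class like that -- not the actual httptransformer code, just a minimal sketch assuming the server honors HTTP Range requests; ResumableStream and resume_state.json are made-up names for illustration:

import json
import os
import requests

class ResumableStream:
    """stream a remote file in chunks, persisting the byte offset so a
    reboot can pick up processing right where it left off."""

    def __init__(self, url, state_path="resume_state.json", chunk_size=1 << 20):
        self.url = url
        self.state_path = state_path
        self.chunk_size = chunk_size
        self.offset = 0
        # if a previous run saved its position, start from there
        if os.path.exists(state_path):
            with open(state_path) as f:
                self.offset = json.load(f)["offset"]

    def chunks(self):
        # ask the server to start at the saved offset (needs Range support)
        headers = {"Range": f"bytes={self.offset}-"}
        with requests.get(self.url, headers=headers, stream=True) as r:
            r.raise_for_status()
            for chunk in r.iter_content(self.chunk_size):
                yield chunk
                self.offset += len(chunk)
                # persist progress after each chunk so a crash or reboot resumes here
                with open(self.state_path, "w") as f:
                    json.dump({"offset": self.offset}, f)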
i tried it on google colab, but it ran at about the same speed, because the download is streamed synchronously -- so there's some interest in focusing on speeding up the network portion, which is the biggest bottleneck
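
the obvious direction is to fetch byte ranges concurrently instead of one synchronous stream. this isn't what httptransformer does today -- just a minimal sketch, assuming the server supports HTTP Range requests; fetch_parallel, part_size etc. are made-up names:

import concurrent.futures as cf
import requests

def fetch_range(url, start, end):
    # fetch one byte range; servers that support Range answer 206 Partial Content
    r = requests.get(url, headers={"Range": f"bytes={start}-{end}"}, timeout=60)
    r.raise_for_status()
    return start, r.content

def fetch_parallel(url, total_size, part_size=8 << 20, workers=8):
    # split [0, total_size) into fixed-size parts, fetch them on a thread pool,
    # and copy each part back into place as it completes
    ranges = [(s, min(s + part_size, total_size) - 1)
              for s in range(0, total_size, part_size)]
    buf = bytearray(total_size)
    with cf.ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [pool.submit(fetch_range, url, s, e) for s, e in ranges]
        for fut in cf.as_completed(futures):
            start, data = fut.result()
            buf[start:start + len(data)] = data
    return bytes(buf)

even a handful of workers hides most of the per-request round-trip latency, at the cost of buffering the in-flight parts in memory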
