baggle

BAGA

ummmmmmmmmmmmmmmmm

so. there are language models that produce many tokens at once, in a
single forward pass! maybe these could run more effectively on embedded
systems! likely so!

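so if you can produce n tokens at once, then that amortizes the cost of
going through all your layers: one pass through the layers is shared by
n tokens, so the per-token share of that pass drops by roughly a factor
of n. it would make it cheaper to offload the layers, since each one
only has to be fetched once per n tokens!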
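
here is a rough back-of-the-envelope sketch of that amortization. it is
only a toy cost model, not a measurement: it assumes every layer's
weights get fetched from slow storage once per forward pass, and that
compute scales linearly with the number of tokens produced. the function
name, parameters, and the numbers plugged in are all made up for
illustration.

```python
def per_token_seconds(num_layers, layer_load_s, layer_compute_s, tokens_per_pass):
    """Toy estimate of seconds per generated token with offloaded layers.

    Assumptions (hypothetical, for illustration only):
      - each layer's weights are loaded once per forward pass (layer_load_s)
      - compute per layer scales linearly with tokens produced (layer_compute_s each)
    """
    if tokens_per_pass < 1:
        raise ValueError("tokens_per_pass must be at least 1")
    # Time for one full pass: load every layer once, compute for every token.
    pass_seconds = num_layers * (layer_load_s + tokens_per_pass * layer_compute_s)
    # That pass yields tokens_per_pass tokens, so the load cost is shared.
    return pass_seconds / tokens_per_pass


if __name__ == "__main__":
    # Made-up numbers: 32 layers, 50 ms to fetch a layer, 2 ms compute per token.
    for n in (1, 2, 4, 8):
        print(n, per_token_seconds(32, 0.05, 0.002, n))
```

in this model the load term shrinks as 1/n while the compute term stays
put, so the win is biggest when fetching the weights dominates, which is
the offloading case.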
