OK, a private copy of the parameters. That makes sense. Your wording struck me as a little odd because it contrasts with a model of your own which does update private parameters.
I understand the main contrast you're drawing with YKY's position to still be that training is expensive but interpolation is cheap (famously, someone ran a minimal model on a '90s desktop). Do you have some kind of online training algorithm yourself?

BTW, while I'm commenting: YKY, I understand you to be arguing that it should be possible to learn a symbolic abstraction, and that it hasn't been done because the learning needs a refactoring. That strikes me as a similar idea to the startup Symbolica. George Morgan is working on graph re-write rules of some kind. He had been working with Bruno Gavranović on a category-theoretic solution, but Morgan forced it back to graph re-writing, from what I understand.

YKY might be interested in this summary of Category Theory being applied to ML problems: https://johncarlosbaez.wordpress.com/2025/02/08/category-theorists-in-ai/

Of course, I think any attempt to learn a symbolic abstraction will fail because the patterns are chaotic. Though I do like Category Theory as a way of capturing some of the variability.

On Tue, May 13, 2025 at 12:33 AM Matt Mahoney <[email protected]> wrote:
>
> -- Matt Mahoney, [email protected]
>
> On Sun, May 11, 2025, 9:25 PM Rob Freeman <[email protected]> wrote:
>
>> Matt,
>>
>> What do you mean "each session creates a private copy"?
>
> I mean that your prompts don't update the model. If they did, information
> would leak between unrelated users. After the model is trained, every user
> sees the same fixed set of 10 to 100 billion parameters.
>
> Training and prediction cost about the same. Training costs a few million
> dollars per trillion tokens. Prediction costs a few dollars per million
> tokens.
>
> My models don't work that way. They predict the next bit of text, then
> update the model in proportion to the prediction error. That is more
> accurate because the model has the most up-to-date information. For online
> LLM services to do this, they would have to make a private copy of the
> parameters.
> Some services charge a lower rate for cached input, so maybe they are
> doing this instead of using a large context window. This would be closer
> to the way the brain works, and more economical IMHO.
>
> Permalink:
> https://agi.topicbox.com/groups/agi/Tdc5c19d0f38aacd6-Md357d2bb532c783380060925

------------------------------------------
Artificial General Intelligence List: AGI
Permalink: https://agi.topicbox.com/groups/agi/Tdc5c19d0f38aacd6-M4a1e4640bf2dd70bdf85e414
Delivery options: https://agi.topicbox.com/groups/agi/subscription
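For anyone following along, the scheme Matt describes (predict the next bit, then nudge the model in proportion to the prediction error) can be sketched roughly as below. This is my own minimal illustration in the spirit of PAQ-style bit predictors, not Matt's actual code; the class name, the 8-bit context, and the learning rate are all assumptions.

```python
import math

class OnlineBitPredictor:
    """One logistic weight per recent-bit context; updated online."""

    def __init__(self, context_bits=8, lr=0.1):
        self.lr = lr
        self.mask = (1 << context_bits) - 1
        self.weights = [0.0] * (1 << context_bits)  # one weight per context
        self.context = 0  # the last `context_bits` bits seen

    def predict(self):
        # Probability that the next bit is 1, via the logistic function.
        w = self.weights[self.context]
        return 1.0 / (1.0 + math.exp(-w))

    def update(self, bit):
        # Update in proportion to the prediction error (bit - p),
        # then shift the observed bit into the context.
        p = self.predict()
        self.weights[self.context] += self.lr * (bit - p)
        self.context = ((self.context << 1) | bit) & self.mask

def bits(data: bytes):
    """Yield the bits of a byte string, most significant bit first."""
    for byte in data:
        for i in range(7, -1, -1):
            yield (byte >> i) & 1

# Feed a repetitive string; predictions improve as more bits are seen,
# because every update uses the most up-to-date information.
model = OnlineBitPredictor()
total_loss = 0.0
for b in bits(b"ab" * 64):
    p = model.predict()
    total_loss += -(b * math.log(p) + (1 - b) * math.log(1 - p))
    model.update(b)
```

Note there is no separate training phase: the same pass that predicts also updates, which is the contrast Matt is drawing with fixed-parameter LLM serving.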
