OK, a private copy of the parameters. That makes sense. Your wording
perhaps struck me as a little odd because it stands in contrast to a model
of yours which updates private parameters.

I understand the main contrast you're making with YKY's approach to still
be the point that training is expensive, but interpolation is cheap
(famously, someone ran some minimal model on a '90s desktop).

You have some kind of online training algorithm yourself?

BTW, while I'm commenting: YKY, I understand you to be arguing that it
should be possible to learn a symbolic abstraction, and that it hasn't been
done because the learning needs a refactoring. Which strikes me as a
similar idea to the startup Symbolica's. George Morgan is working on graph
rewrite rules of some kind. He had been working with Bruno Gavranović on a
category-theoretic solution, but Morgan forced it back to graph rewriting
again. From what I understand.

YKY might be interested in this summary of Category Theory being applied to
ML problems:

https://johncarlosbaez.wordpress.com/2025/02/08/category-theorists-in-ai/

Of course I think any attempt to learn a symbolic abstraction will fail
because the patterns are chaotic. Though I like Category Theory as
capturing some of the variability.

On Tue, May 13, 2025 at 12:33 AM Matt Mahoney <[email protected]>
wrote:

>
>
> -- Matt Mahoney, [email protected]
>
> On Sun, May 11, 2025, 9:25 PM Rob Freeman <[email protected]>
> wrote:
>
>> Matt,
>>
>> What do you mean "each session creates a private copy"?
>>
>
> I mean that your prompts don't update the model. If they did,
> information would leak between unrelated users. After the model is trained,
> every user sees the same fixed set of 10 to 100 billion parameters.
>
> Training and prediction cost about the same per token. Training costs a few million
> dollars per trillion tokens. Prediction costs a few dollars per million
> tokens.
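
[Those two figures are the same per-token rate, just quoted in different
units. A quick sanity check, taking "a few million" as $3M purely for
illustration:]

```python
# Same rate, different units: dollars per trillion tokens vs. per million.
dollars_per_trillion_tokens = 3_000_000   # illustrative stand-in for "a few million"
tokens_per_trillion = 10**12
tokens_per_million = 10**6

dollars_per_million_tokens = (
    dollars_per_trillion_tokens * tokens_per_million // tokens_per_trillion
)
print(dollars_per_million_tokens)  # 3, i.e. a few dollars per million tokens
```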
>
> My model doesn't work that way. It predicts the next bit of text, then
> updates the model in proportion to the prediction error. It is more
> accurate because it has the most up-to-date information. For online LLM
> services to do this, they would have to make a private copy of the
> parameters. Some services charge a lower rate for cached input, so maybe
> they are doing this instead of using a large context window. This would be
> closer to the way the brain works and more economical, IMHO.
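
[The update rule described there — predict the next bit, then adjust in
proportion to the error — can be sketched in a few lines. This is a generic
per-context logistic sketch, not Matt's actual code; the class name, context
size, and learning rate are all made up for the example:]

```python
import math

class OnlineBitPredictor:
    """Toy online bit predictor: predict the next bit, then update the
    weight for the current context in proportion to the prediction error.
    Illustrative only; context size and learning rate are arbitrary."""

    def __init__(self, context_bits=8, lr=0.2):
        self.n = 1 << context_bits     # one weight per context value
        self.w = [0.0] * self.n        # logits; zero means p = 0.5
        self.mask = self.n - 1
        self.ctx = 0                   # last context_bits bits seen
        self.lr = lr

    def predict(self):
        # probability that the next bit is 1, given the current context
        return 1.0 / (1.0 + math.exp(-self.w[self.ctx]))

    def update(self, bit):
        # error-proportional update: nudge the logit toward the observed bit
        self.w[self.ctx] += self.lr * (bit - self.predict())
        self.ctx = ((self.ctx << 1) | bit) & self.mask

# After enough of a repeating pattern, predictions sharpen online:
m = OnlineBitPredictor()
for b in [0, 1] * 1000:
    m.update(b)
print(m.predict())  # near 0: the alternating context says the next bit is 0
```

[Note the point about private copies: the whole state here is `self.w` and
`self.ctx`, so serving this per user means one such copy per session.]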
>

------------------------------------------
Artificial General Intelligence List: AGI
Permalink: 
https://agi.topicbox.com/groups/agi/Tdc5c19d0f38aacd6-M4a1e4640bf2dd70bdf85e414
Delivery options: https://agi.topicbox.com/groups/agi/subscription
