On Mon, Feb 3, 2025 at 12:45 PM Alexander Schreiber via cctalk <cctalk@classiccmp.org> wrote:
> On Mon, Feb 03, 2025 at 07:08:32PM -0000, Donald Whittemore via cctalk wrote:
>
> On top of that: A lot of those LLMs are built on theft at an epically large scale. They hoovered up everything in sight (and then some) without even pretending to care about intellectual property rights - e.g. the NY Times has taken OpenAI to court because they managed to make the OpenAI LLMs spit out long verbatim fragments of NY Times content. The hilarious part is that DeepSeek essentially stole from OpenAI that which OpenAI previously stole from everyone else, and OpenAI is very angry about the lack of honor among thieves or something ;-)

My understanding was that OpenAI accused DeepSeek of "distilling" their model, presumably by making API queries to OpenAI's service. However, "distillation" normally means generating a smaller ("student") model from a larger ("teacher") model, whereas in this case DeepSeek apparently produced something more of a peer to the teacher. Maybe there was some "veneer" of final training, but the basic assertion of "they stole our work" is probably more OpenAI trying to control the narrative.

Now, whether DeepSeek stole N different entities' IP, that's a different question. As you said, there is no way to reproduce the model, so what's on GitHub isn't "open source" in most people's understanding. Still, it's better than Microsoft/OpenAI, where the model is "closed" behind an API.
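For anyone curious what "distillation" actually means mechanically, here is a minimal sketch of the classic Hinton-style objective: the student is trained to match the teacher's temperature-softened output distribution via a KL-divergence loss. This is purely illustrative (the function names and toy logits are mine, not from any OpenAI or DeepSeek code) - and note that true distillation assumes access to the teacher's full probability distributions, which an API that only returns sampled text does not readily give you:

```python
import math

def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax: higher T flattens the distribution,
    exposing the teacher's 'dark knowledge' about near-miss classes."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)                      # subtract max for numerical stability
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """KL(teacher || student) on softened distributions -- the quantity
    a student model minimizes during distillation."""
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))

# Toy example: a student whose logits match the teacher's has zero loss;
# a mismatched student has a strictly larger loss.
teacher = [2.0, 1.0, 0.1]
aligned = distillation_loss(teacher, [2.0, 1.0, 0.1])
off     = distillation_loss(teacher, [0.1, 1.0, 2.0])
print(aligned, off)
```

The point of the temperature is that the teacher's *relative* confidence across wrong answers carries information a hard label does not - which is why a student distilled this way is normally smaller than, not a peer of, its teacher.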