On Mon, Feb 3, 2025 at 12:45 PM Alexander Schreiber via cctalk <cctalk@classiccmp.org> wrote:
> On Mon, Feb 03, 2025 at 07:08:32PM -0000, Donald Whittemore via cctalk wrote:
>
> On top of that: A lot of those LLMs are built on theft at an epically large scale. They hoovered up everything in sight (and then some) without even pretending to care about intellectual property rights - e.g. the NY Times has taken OpenAI to court because they managed to make the OpenAI LLMs spit out long verbatim fragments of NY Times content. The hilarious part is that DeepSeek essentially stole from OpenAI that which OpenAI previously stole from everyone else, and OpenAI is very angry about the lack of honor among thieves or something ;-)

My understanding was that OpenAI accused DeepSeek of "distilling" their model, presumably by making API queries to OpenAI's service. However, "distillation" normally means generating a smaller ("student") model from a larger ("teacher") model, whereas in this case DeepSeek apparently produced something more of a peer to the teacher. Maybe there was some "veneer" of final training, but the basic assertion of "they stole our work" is probably more OpenAI trying to control the narrative.

Now, whether DeepSeek stole N different entities' IP, that's a different question. As you said, there is no way to reproduce the model, so what's on GitHub isn't "open source" in most people's understanding. Still, it's better than Microsoft/OpenAI, where the model is "closed" behind an API.
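For anyone curious what "distillation" actually means mechanically, here is a minimal sketch of the classic Hinton-style objective: the student is trained to match the teacher's temperature-softened output distribution via a KL-divergence loss. This is purely illustrative (the function names and toy logits are mine, not from any OpenAI or DeepSeek code) - and note that true distillation assumes access to the teacher's full probability distributions, which an API that only returns sampled text does not readily give you:

```python
import math

def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax: higher T flattens the distribution,
    exposing the teacher's 'dark knowledge' about near-miss classes."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)                      # subtract max for numerical stability
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """KL(teacher || student) on softened distributions -- the quantity
    a student model minimizes during distillation."""
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))

# Toy example: a student whose logits match the teacher's has zero loss;
# a mismatched student has a strictly larger loss.
teacher = [2.0, 1.0, 0.1]
aligned = distillation_loss(teacher, [2.0, 1.0, 0.1])
off     = distillation_loss(teacher, [0.1, 1.0, 2.0])
print(aligned, off)
```

The point of the temperature is that the teacher's *relative* confidence across wrong answers carries information a hard label does not - which is why a student distilled this way is normally smaller than, not a peer of, its teacher.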