> On Feb 3, 2025, at 3:40 PM, Alexander Schreiber via cctalk 
> <cctalk@classiccmp.org> wrote:
> 
> ...
> On top of that: A lot of those LLMs are built on theft at an epically large
> scale. They hoovered up everything in sight (and then some) without even
> pretending to care about intellectual property rights - e.g. the NY Times
> has taken OpenAI to court because they managed to make the OpenAI LLMs
> spit out long verbatim fragments of NY Times content. The hilarious part
> is that DeepSeek essentially stole from OpenAI that which OpenAI previously
> stole from everyone else and OpenAI is very angry about the lack of honor
> among thieves or something ;-)

Excellent point.  I tend to refer to LLMs as "derived work generators" to point 
out the copyright problems that are fundamental to what they do.

I also tend to wonder about web hoovering as a training scheme, given that a 
lot of web content is fiction.  And I don't mean "misinformation", I just mean 
novels and the like.  What happens to an LLM that inhales "The Martian" or 
"Ringworld"?

        paul

