On Mon, 5 May 2025 at 01:05, Wouter Verhelst <wou...@debian.org> wrote:

> On Sun, May 04, 2025 at 07:08:00PM +0200, Aigars Mahinovs wrote:
> >    On Sun, 4 May 2025 at 17:30, Wouter Verhelst <w...@uter.be> wrote:
> >
> >      >    Wikipedia definition is a layman's simplification.
> >      It may be a simplification, but that in and of itself does not make
> >      it
> >      incorrect.
> >
> >    I have specifically addressed this point with examples in my reply.
> >    Copyright very clearly does not survive learning and then generation
> of
> >    new solutions. In humans that is a given.
>
> Indeed.
>
> >    For software I would assume the equivalence, unless proven
> >    differently.
>
> This is not a fact; this is your opinion. You base the rest of your
> argument on it, so I'll call it an axiom: something to accept in order
> for the rest of the argument to hold.
>
> The problem is, I disagree with your axiom.
>
> To me, software and humans are two very different things. We know how
> computers work; we can therefore reason what the output of a software
> program is going to be based on the input that you give it. Whether that
> program is a compiler or a trainer program for a deep learning model is
> just a detail in that context. One computer chip of a given model and
> stepping is 100% equivalent to another, and so any process that runs on
> one of these chips will produce the same output on another.
>
> The same is not true for human brains; we do not fully understand how
> they work, we cannot predict what the resulting experience of a given
> person is going to render based on the training that person has
> received, and therefore we cannot predict how a given person is going to
> write a particular piece of software.


Theoretically we could predict both, if we knew all the inputs and all the
algorithms; in practice we simply do not know them for most humans. Taken
to its extreme, this logic would make all human-written software non-free,
because we cannot reproduce the training inputs, and the model software
itself has no known source code. /s

But we don't need to go that far, because fair use under copyright law does
not require a transformation to be non-deterministic in order to be
transformative. Nor does it require the transformation to be carried out by
a human.

The simple, entirely deterministic, and clearly fully automatic,
software-only creation of thumbnails for search engines has been ruled fair
use (Perfect 10 v. Amazon). So has the very straightforward full-text
indexing of books by Google Books, together with serving direct snippets of
those materials to end users (Authors Guild v. Google).

The transformative criterion here is that the resulting work must be
transformed in a way that adds value. Generating new text with an LLM is
pretty clearly a value-adding transformation relative to the original
articles, even more so than in the already-decided Google Books case.
-- 
Best regards,
    Aigars Mahinovs
