On Fri, 2025-07-18 at 13:32 +0200, Florian Weimer wrote:
> * Josh Boyer:
> 
> > On Fri, Jul 18, 2025, 6:45 AM Florian Weimer <fwei...@redhat.com> wrote:
> > 
> > > * Josh Boyer:
> > > 
> > > > Do you have a reference to a court in another region that has
> > > > stated an opposite view? Or perhaps some proposed legislation
> > > > from somewhere?
> > > 
> > > No legislation is required because creators' rights do not derive
> > > from an Act of Congress in most parts of the world. The U.S. notion
> > > that copyright serves the purpose of promoting progress is
> > > completely alien to other jurisdictions. If they recognize the
> > > right to personal property at all, they view the right to
> > > intellectual property as a basic human right, growing from exactly
> > > the same source as other property rights.
> > 
> > I find that all well and good for works created by humans. AI is not
> > a human, so it is not obvious to me how the concepts of "property"
> > or "rights" apply to output generated by a machine.
> 
> The issue is not just the output. The way models are built, they
> necessarily embed the creators' works along with their rights. The
> U.S. legal consensus may indeed be that creators' rights do not matter
> because Congress has not decided they do, but that won't fly elsewhere
> because of the way intellectual property rights work there.
> 
> If someone listens to a piece of music, transcribes it to musical
> notation, and they or someone else interpret and re-record it from the
> sheet music they produced, the composer still has rights in the
> recording. From what I can tell, this process is more complex and
> transformative than what training and inference with large language
> models accomplish. And yet few would argue that the composer loses
> their rights along the way. If we view the effect of large language
> models differently, that's because we've been conditioned by
> commercial interests (and perhaps to a lesser degree by the
> anti-copyright folks) to do so.
> 
> The most likely outcome is that models will be treated like editing
> software: you can use them to create novel content, or to re-create
> content that others have produced. Even with traditional editors, it
> is possible that you intend to create something novel but
> subconsciously or deliberately reproduce someone else's work.
> Subconscious reproduction may not be much of an issue for code today
> (at least not for copyright purposes), but it is already a very real
> problem for other types of content. Use of models simply exposes this
> challenge to more creators. I find it very unlikely that there will be
> consensus that these new editors cannot produce infringing content due
> to the way they work.
Well, given that copyright protects actual works and not their style or
the ideas behind them, I am really hard pressed to consider the risk
high enough to be that concerned. I mentioned the US cases previously
just as a data point; I am not saying US case law is the only thing
that matters. But most copyright laws protect only works made by
humans, and mechanical transformations almost universally do not give
rise to new copyright or shift copyright to other people.

At the same time, an LLM does not just spit out copies of other
people's code. It *can* do that if prompted to, but generally there is
quite a creative process in instructing the AI on what to do, and
unless you are literally asking the AI to write the same piece of code
others wrote, the chance of literal copies diminishes quickly. IMHO it
is no different from the chance of you reproducing the same code others
did because it is the natural thing to do in the specific language for
the specific task. Clearly nobody is going to accuse you of copying
their code for having written a for loop over the argv argument of a C
program's main function with the variable "i" as the counter (a trivial
example appears at the end of this message).

I do not discount the chance that an LLM may recreate code that closely
resembles someone else's, but then you have to go into a full
discussion about how many lines of code are required for it to be
copyrightable, how different they can or cannot be, and a litany of
other tests that each court may apply differently, even within the same
jurisdiction. And this is the same whether a human wrote it or an AI
did.

Ultimately the questions are:
1. Can you effectively police it?
2. How likely is this to be an actual problem (i.e., what is the level
   of risk)?

To 1) I answer NO. To 2) I think the answer is "LOW".

I may be wrong, but I do not think making up hypothetical contrived
cases is useful; you need to think about the practical cases and likely
consequences IMHO.

Simo.

-- 
Simo Sorce
Distinguished Engineer
RHEL Crypto Team
Red Hat, Inc
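For concreteness, here is a minimal sketch of the kind of loop the argv
example above has in mind. The printf body is an illustrative
assumption; the point is that the loop shape and the counter "i" are
the natural way to write this task in C, so two people (or a person and
a model) producing it independently will end up with nearly identical
code:

    #include <stdio.h>

    /* The canonical argv loop: iterate over the program's arguments
     * with "i" as the counter. Code this generic is simply the natural
     * way to express the task in C, so identical copies of it say
     * nothing about copying. The printf body is illustrative only. */
    int main(int argc, char *argv[])
    {
        for (int i = 0; i < argc; i++)
            printf("argv[%d] = %s\n", i, argv[i]);
        return 0;
    }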