On Thu, 2025-07-17 at 19:29 +0100, Daniel P. Berrangé wrote:
> On Wed, Jul 16, 2025 at 04:09:41PM -0400, Stephen Gallagher wrote:
> > I'm thinking we should treat AI-generated code the same way that we
> > would treat sub-contracted code. I've worked at companies that
> > outsourced some software development to subcontracting companies. The
> > way this would generally work is that there would be an on-site
> > coordinator who submitted all of the code on behalf of the (most
> > likely underpaid) coders working elsewhere. The way this was
> > interpreted is that the coordinator, as a representative of the
> > subcontracting company, was taking on the responsibility (and
> > accountability) for verifying that the content being submitted is
> > functional, non-malicious and not *known to be* violating anyone's
> > copyright. If later it turned out that someone on their team was
> > stealing code, the person whose name was on the commit would be held
> > responsible for that violation.
> > 
> > I think we can realistically only hold generative AI submissions to
> > roughly this same standard: we already trust our contributors to do
> > their due diligence. They remain responsible for what the code they
> > submit does (and will be held accountable for it if it's malicious or
> > violates copyrights and patents).
> 
> In practical terms, how can a contributor do due diligence on the
> output of an AI generator? The vast size of the training material
> makes it hard, if not impossible, to validate the license & copyright
> compliance of non-trivial code. Some tools claim to validate their
> output for compliance in some manner, but what that actually means
> is hard to find out & the reliability of such claims is unclear.
> 
> In the case of a human sub-contractor (or any 3rd party you acquired
> a patch via) you can expect to have a number of practical ways to do
> useful "due diligence" to gain confidence in the code you receive.
> Particularly if you work with someone over time, you can increasingly
> build a strong trust relationship.
> 
> This is materially different to a relationship with an AI which
> (at least with common tools today) can be more than a little bit
> inconsistent and unpredictable on an ongoing basis, making it
> hard to build up trust over time to the extent you would with
> a person.
> 
> NB for trivial code the situation is somewhat different & simpler,
> as you can likely make a claim that trivial changes won't meet
> the threshold for copyright protection, whether from a human or
> AI.
> 
> >                                   And, frankly, there is very little
> > we can do to detect whether code was AI-generated or written by a
> > human being. If we try to make rules against GenAI, the practical
> > effect will be that people will simply stop including notes telling
> > us about it. Discouraging transparency won't improve the situation
> > at all.
> 
> Given the direction of the tech industry in general wrt AI, IMHO,
> the absence of a written policy on AI-generated contributions will
> effectively soon imply a policy that the project accepts any & all
> use of AI.
> 
> IOW, not making a decision on AI is effectively making a decision
> on AI.
> 
> Maybe that is nonetheless right for Fedora; I can't say. I
> think it is important that we spend time to debate & investigate the
> different aspects, so we can judge whether we benefit from an
> explicit policy (whether it says to allow or deny or delegate it
> to contributors' judgement, or something else) or not.
> 
> 
> We may not be able to detect all AI-based contributions, but I don't
> think that having such an ability needs to be a prerequisite for
> defining an explicit policy, whatever it may state.
> 
> The operation of Fedora, and OSS communities in general, is heavily
> reliant on trust built up organically between participants over time,
> and IMHO Fedora does fairly well in this respect.
> 
> If we come up with a well-reasoned policy on AI, we should broadly
> be able to expect our contributors to abide by it.
> 
> If a small minority don't, they'll have to accept the consequences
> if someone notices, but if a large set don't, then it could be a
> sign the policy was misguided & needs revisiting, which would at
> least have been a learning experience for us all.
> 
> With regards,
> Daniel

In my opinion the situation is simple: as several courts have already
hinted, the output of an AI cannot be copyrighted, and that makes sense
given that copyright hinges on protecting human creativity and AIs are
clearly not human. So Fedora could decide that the default license for
AI-generated code is simply "Public Domain".

That is, unless there is *significant* work by the developer around
both the prompting and the adjustment of the output.

Additionally, a good policy would be to ensure that the specific AI
used is mentioned, so contributions can be reviewed later in light of
the models used.
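
For example, a commit message trailer along these lines (the trailer
name is purely illustrative, not an established convention) would keep
that record attached to the change itself:

    Assisted-by: <tool name and model version>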

Whatever is done, please keep the friction low, because otherwise people
will simply lie to you and nothing useful will come out of it.

--
Simo Sorce
Distinguished Engineer
RHEL Crypto Team
Red Hat, Inc
