Daniel P. Berrangé <berra...@redhat.com> writes:

> On Wed, Jun 04, 2025 at 10:58:38AM +0200, Markus Armbruster wrote:
>> Daniel P. Berrangé <berra...@redhat.com> writes:
>> 
>> > On Wed, Jun 04, 2025 at 08:17:27AM +0200, Markus Armbruster wrote:
>> >> Stefan Hajnoczi <stefa...@gmail.com> writes:
>> >> 
>> >> > On Tue, Jun 3, 2025 at 10:25 AM Markus Armbruster <arm...@redhat.com> wrote:
>> >> >>
>> >> >> From: Daniel P. Berrangé <berra...@redhat.com>
>> >> >> +
>> >> >> +The increasing prevalence of AI code generators, most notably but not limited
>> >> >
>> >> > More detail is needed on what an "AI code generator" is. Coding
>> >> > assistant tools range from autocompletion to linters to automatic code
>> >> > generators. In addition there are other AI-related tools like ChatGPT
>> >> > or Gemini as a chatbot that people can use like Stackoverflow or an
>> >> > API documentation summarizer.
>> >> >
>> >> > I think the intent is to say: do not put code that comes from _any_ AI
>> >> > tool into QEMU.
>> >> >
>> >> > It would be okay to use AI to research APIs, algorithms, brainstorm
>> >> > ideas, debug the code, analyze the code, etc but the actual code
>> >> > changes must not be generated by AI.
>> >
>> > The scope of the policy is around contributions we receive as
>> > patches with SoB. Researching / brainstorming / analysis etc
>> > are not contribution activities, so not covered by the policy
>> > IMHO.
>> 
>> Yes.  More below.
>> 
>> >> The existing text is about "AI code generators".  However, the "most
>> >> notably LLMs" that follows it could lead readers to believe it's about
>> >> more than just code generation, because LLMs are in fact used for more.
>> >> I figure this is your concern.
>> >> 
>> >> We could instead start wide, then narrow the focus to code generation.
>> >> Here's my try:
>> >> 
>> >>   The increasing prevalence of AI-assisted software development results
>> >>   in a number of difficult legal questions and risks for software
>> >>   projects, including QEMU.  Of particular concern is code generated by
>> >>   `Large Language Models
>> >>   <https://en.wikipedia.org/wiki/Large_language_model>`__ (LLMs).
>> >
>> > Documentation we maintain has the same concerns as code.
>> > So I'd suggest to substitute 'code' with 'code / content'.
>> 
>> Makes sense, thanks!
>> 
>> >> If we want to mention uses of AI we consider okay, I'd do so further
>> >> down, to not distract from the main point here.  Perhaps:
>> >> 
>> >>   The QEMU project thus requires that contributors refrain from using AI code
>> >>   generators on patches intended to be submitted to the project, and will
>> >>   decline any contribution if use of AI is either known or suspected.
>> >> 
>> >>   This policy does not apply to other uses of AI, such as researching APIs or
>> >>   algorithms, static analysis, or debugging.
>> >> 
>> >>   Examples of tools impacted by this policy includes both GitHub's CoPilot,
>> >>   OpenAI's ChatGPT, and Meta's Code Llama, amongst many others which are less
>> >>   well known.
>> >> 
>> >> The paragraph in the middle is new, the other two are unchanged.
>> >> 
>> >> Thoughts?
>> >
>> > IMHO it's redundant, as the policy is expressly around contribution of
>> > code/content, and those activities are not contribution related, so
>> > outside the scope already.
>> 
>> The very first paragraph in this file already set the scope: "provenance
>> of patch submissions [...] to the project", so you have a point here.
>> But does repeating the scope here hurt or help?
>
> I guess it probably doesn't hurt to have it. Perhaps tweak to
>
>  This policy does not apply to other uses of AI, such as researching APIs or
>  algorithms, static analysis, or debugging, provided their output is not
>  to be included in contributions.
>
> and for the last paragraph remove 'both' and add a trailer
>
>    Examples of tools impacted by this policy include GitHub's CoPilot,
>    OpenAI's ChatGPT, and Meta's Code Llama (amongst many others which are less
>    well known), and code/content generation agents which are built on top of
>    such tools.

Sold!

