This is interesting:
https://openjdk.org/legal/ai
OpenJDK Interim Policy on Generative AI
The field of generative AI is evolving quickly. It brings compelling
opportunities to improve developer productivity, but it also brings
risks: to reviewer burden, to safety and security, and to intellectual
property.
Oracle, as the corporate sponsor of the OpenJDK Community, is working
to draft a full policy governing the use of generative AI tools in
OpenJDK contributions. Oracle will propose that policy to the OpenJDK
Governing Board in due course. Until that policy is in place, the
Governing Board has approved this interim policy:
Contributions in the OpenJDK Community must not include content
generated, in part or in full, by large language models, diffusion
models, or similar deep-learning systems. Content, in this
context, includes but is not limited to source code, text, and
images in OpenJDK Git repositories, GitHub pull requests,
e-mail messages, wiki pages, and JBS issues.
Contributors in the OpenJDK Community may use generative AI tools
privately to help comprehend, debug, and review OpenJDK code and
other content, and to do research related to OpenJDK Projects, so
long as they do not contribute content generated by such tools.
This interim policy aims to encourage the use of generative AI tools
in ways that limit their risks while we gain further experience that
will inform the full policy.
Frequently Asked Questions
1.
/What are the risks to reviewer burden of using generative AI tools?/
Generative AI tools, by their nature, make it easy to create large
quantities of plausible-looking code, with plausible-looking
tests, that is nonetheless incorrect or, even if correct,
poorly designed and therefore difficult to maintain. Reviewing
submissions of such code can easily become a drain on the already
limited time of human reviewers. For this reason, some open-source
communities have limited, if not banned, the submission of code
created by generative AI tools.
2.
/What are the risks to safety and security of using generative AI
tools?/
The JDK, developed and maintained in the OpenJDK Community, is the
primary implementation of the Java Platform. It sits at the
foundation of mission-critical systems in businesses, governments,
and other organizations around the world. Safety and security are
paramount. Plausible-looking but incorrect code would put these
critical properties at risk.
3.
/What are the intellectual-property risks of using generative AI
tools?/
The Oracle Contributor Agreement (OCA) requires that a contributor
own the intellectual property rights in each contribution and be
able to grant those rights to Oracle, without restriction. Most
generative AI tools, however, are trained on copyrighted and
licensed content, and their output can include content that
infringes those copyrights and licenses, so contributing such
content would violate the OCA. Whether a user of a generative AI
tool has IP rights in content generated by the tool is the subject
of active litigation.
4.
/Despite these risks, generative AI tools can provide significant
value. Are OpenJDK contributors forbidden from using them altogether?/
No. As the policy says, you are welcome to use such tools to help
comprehend, debug, and review OpenJDK code and other content.
Anecdotal evidence from other communities suggests that analysis
of existing code, rather than creation of new code, is where
generative AI tools shine for established projects with large code
bases. This is consistent with our experience thus far.
5.
/What does it mean to use generative AI tools “privately”?/
The intent of that term is to emphasize that you may use such
tools on your own, without contributing the content that they
generate. It does not mean that you cannot, /e.g./, share and
discuss the output of such tools with a colleague. When sharing
such content, consider adding prominent comments that identify it
as being AI-generated.
6.
/Is it okay to continue using the spell-checking,
grammar-checking, auto-completion, and refactoring features in my
editor or IDE?/
Yes, so long as they are not based on large language models or
similar deep-learning systems.
7.
/Is it okay to use a generative AI tool to review draft JEPs,
JavaDoc, or other documents, so long as I write all of the text
myself?/
Yes. This is clearly a case of using a generative AI tool to
review content, which is fine.
8.
/If I use a generative AI tool to create 100 lines of code, and
then edit ten of those lines myself, may I contribute the result?/
No. Your contribution would still include, in part, AI-generated code.
9.
/Can we improve any of our tooling to help remind contributors of
this policy?/
Yes. We will shortly reconfigure Skara to add a checkbox to the
body of each pull request on GitHub. When you create a pull
request, you must check the box to affirm that your contribution
is in accordance with the policy. More details, including how to
add the checkbox to the body of an existing pull request, are
available in the wiki
<https://wiki.openjdk.org/spaces/SKARA/pages/56524965/FAQ#FAQ-OpenJDKInterimAIPolicy>.
10.
/In an OpenJDK Project, is it okay to add a feature that calls out
to an external AI service?/
That depends upon the service’s terms of use, so this amounts to a
legal question. Be aware that many such terms place strict limits
on how the service may be used. Consult your attorney, or your
employer’s attorney, as appropriate, and make sure that everyone
with a vested interest in your Project has also consulted
appropriate attorneys.
11.
/As a Reviewer in an OpenJDK Project, am I responsible for
detecting when a contributor has submitted code or other content
created with a generative AI tool?/
In this role you are already expected to do your best to ensure
that incoming contributions are consistent with OpenJDK Community
policies and conventions. In general, reliably distinguishing
human-generated content from AI-generated content is impossible.
If, however, you see evidence that content in a contribution was
created with a generative AI tool, then it is your responsibility
to notify the contributor of that fact. If the contributor does
not respond positively and remove the content, please bring that
to the attention of the appropriate Project Lead.
12.
/What are some tell-tale clues of content created by generative AI
tools?/
Sometimes it is obvious, for example when a commit message in the
personal fork from which a contributor initiates a pull request
includes a |Co-Authored-By| trailer line that gives credit to a
specific generative AI tool. Other times it is more subtle, for
example when a contributor’s comments in a pull-request
conversation or an e-mail message are in a chatty, verbose style
inconsistent with their past writing. Other clues include highly
structured comments with multiple headings, unnecessary comments
in code, gratuitously defensive programming, and the use of emoji
characters.
Generative AI tools are evolving rapidly, so clues that are
effective indicators today might not be effective indicators
tomorrow. In general, if something in a pull request seems
uncannily cheerful or meticulous, then you could be looking at
AI-generated content.
Gianluca Sartori
--
https://dueuno.com