[Wikimedia-l] Re: 23 March: Invitation to Open Community Call on ChatGPT, generative AI, and Wikimedia

Erik Moeller Fri, 31 Mar 2023 07:05:48 -0700

On Thu, Mar 30, 2023 at 12:25 PM Lauren Worden <[email protected]> wrote:
> > If you don't obtain this agreement, you cannot meaningfully enforce
> > the "license" because the downloader never agreed to it in the first
> > place. Moreover, you'll have to make sure that _everyone else making
> > copies of the file_ also obtains agreement from people getting those
> > copies, or your whole house of cards falls down.


> Isn't that exactly how we impose attribution and share-alike
> requirements of CC-BY-SA content?

Not exactly. CC-BY-SA gives Wikimedia readers permissions they would
not otherwise have (e.g., to distribute copies), and it ties those
permissions to certain obligations (e.g., attribution). Readers who do
not wish to exercise those additional permissions are not required to
adhere to the obligations. They'd just be limited to what copyright
law lets anyone do with content downloaded from a public website.
Nobody can stop you from making your own offline version of Wikipedia,
calling it "Bobbypedia", and removing all other attribution -- as long
as you keep it to yourself.

To be sure, you can put restrictions in an AI model license that kick
in for folks distributing the model, which is something they wouldn't
legally be able to do without consulting and agreeing to the licensing
terms. But, crucially, you don't have to distribute an AI model to run
it. Most of the unethical uses folks tend to worry about (e.g., bulk
generation of misinformation) do not involve distributing copies of
the model, only of its output.

If you want to impose ethical use restrictions on people running your
AI models, you have two choices: You can require everyone getting a
copy of the model by any means to explicitly agree to those
restrictions (presumably Facebook does this when distributing LLaMA to
researchers), or you can make your model freely available and protest
ineffectually when a downloader ignores the restrictions you've
spelled out in a text file in your repository. Neither approach is
compatible with open source.

> I have no particular affinity to BLOOM, but I have been able to
> personally test that it is capable of at least a dozen different use
> cases that people have shown GPT-3 and ChatGPT can be used for on
> enwiki.

I think it's fine to explore all sorts of models, free and nonfree,
for the purpose of assessing capabilities and mitigating risks. When
it comes to deployment of models in a production context, IMO
Wikimedia should exclude from consideration any models under
ill-conceived "ethical use" licenses.

Warmly,
Erik
_______________________________________________
Wikimedia-l mailing list -- [email protected], guidelines at: 
https://meta.wikimedia.org/wiki/Mailing_lists/Guidelines and 
https://meta.wikimedia.org/wiki/Wikimedia-l
Public archives at 
https://lists.wikimedia.org/hyperkitty/list/[email protected]/message/4XZMYBMH7XESK23KWPFTBXKM7R2H4DJR/
To unsubscribe send an email to [email protected]
