Since proposals that don't fit into existing discussions elsewhere are on
topic here, I want to boldly recommend the following while the annual
planning process is still ongoing, because it is far beyond the scope of
what could responsibly be accomplished at a hackathon or on WMCS:

First, the Foundation should host a fork of BLOOM [
https://huggingface.co/bigscience/bloom ], which, if I remember correctly,
was described by the Foundation's Machine Learning Director Chris Albon as
the only LLM at the scale of GPT-3 adhering to the movement's FOSS
criteria. This should be done under or alongside Toolforge on Wikimedia
Cloud Services so that staff and volunteers alike may use its API and
submit modification proposals for new instances. Presumably this would cost
on the order of $100,000 per year per instance, according to
https://huggingface.co/bigscience/bloom/discussions/161#63a33373b5fc9ab9f63d97f7
but someone should double-check that math. I've tested BLOOM against a
dozen of the uses of GPT-3 and ChatGPT shown around enwiki, and it seems
to perform about as well. (You can use the Hosted Inference API version on
Azure for free at the Hugging Face URL.)
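
For anyone who wants to double-check that math, here is a rough
back-of-envelope sketch. It does not reproduce the figures in the linked
discussion. The only inputs taken as given are BLOOM's 176 billion
parameters and the roughly $100,000 per year estimate above; the 16-bit
weight size and the assumption that an instance runs around the clock are
mine, and the function names are purely illustrative.

```python
# Back-of-envelope check. Everything here is an assumption except the
# 176B parameter count and the ~$100,000/year figure quoted in the email.
BLOOM_PARAMS = 176e9        # parameters in bigscience/bloom
BYTES_PER_PARAM = 2         # assumes 16-bit (bf16/fp16) weights
HOURS_PER_YEAR = 24 * 365   # assumes one instance running year-round


def weights_gb(params: float = BLOOM_PARAMS,
               bytes_per_param: int = BYTES_PER_PARAM) -> float:
    """Memory needed just to hold the weights, in GB (10^9 bytes)."""
    return params * bytes_per_param / 1e9


def implied_hourly_budget(annual_usd: float) -> float:
    """Hourly hosting price that an annual budget works out to."""
    return annual_usd / HOURS_PER_YEAR


def annual_cost(hourly_usd: float) -> float:
    """Annual cost of an instance billed at a given hourly price."""
    return hourly_usd * HOURS_PER_YEAR


if __name__ == "__main__":
    print(f"weights alone: {weights_gb():.0f} GB")
    print(f"$100,000/year is about ${implied_hourly_budget(100_000):.2f}/hour")
```

Plugging a real hourly quote for hardware with enough memory to hold the
weights into annual_cost() would show whether the $100,000 figure is in the
right range.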

Secondly, the Foundation should sponsor staff-, grant-, affiliate-, and
volunteer-run projects to replicate and extend the work on:

A. RARR [ https://arxiv.org/abs/2210.08726 ] and other methods of
attribution and verification with goals aspiring to Wikipedia's standards
of summarizing and citing sources in ways that can be independently
verified.

B. ROME [ https://rome.baulab.info/ ], MEMIT [ https://memit.baulab.info/ ],
and other approaches to knowledge editing in language models, with the goal
of producing simple interfaces to provide "language models that anyone can
edit", ideally coupled to Wikidata updates.

C. EditEval [ https://eval.ai/web/challenges/challenge-page/1866/overview
], an ongoing challenge competition to produce systems capable of
automatically improving text, including its fluency, simplification,
paraphrasing, neutralization, and updating information.

I apologize to those on Thursday's Zoom call who had proposals for ORES
expansion to combat paid advocacy, images, audio, speech and video, as I
don't remember enough of the details and there's not enough information at
https://meta.wikimedia.org/wiki/Wikimedia_Foundation_Annual_Plan/2023-2024/Draft/External_Trends/Community_call_notes
to include them here. I hope the advocates will elucidate those proposals
on list while the annual planning process is still in progress.

-LW

On Mon, Mar 27, 2023 at 1:04 PM Yael Weissburg <[email protected]>
wrote:

> Hello again everyone,
>
> Thanks again to those who made it to the call last week - it felt like
> such a luxury to be able to drop deeply into this subject for an hour
> (plus) with all of you.
>
> For those who were unable to join, we captured extensive notes on Meta
> <https://meta.wikimedia.org/wiki/Wikimedia_Foundation_Annual_Plan/2023-2024/Draft/External_Trends/Community_call_notes>.
> I hope we continue the vibrant discussion we started together on the Talk
> Page. Maybe someone can use that space to volunteer to host the next call?
> I know many folks are eager to continue the live discussion too.
>
> I also wanted to share a few links / resources that might be useful (I'll
> add these to the Talk page as well):
>
>    - WMF's Legal team recently did a copyright analysis of ChatGPT. You
>    can find that on Meta
>    <https://meta.wikimedia.org/wiki/Wikilegal/Copyright_Analysis_of_ChatGPT>
>    .
>    - There is a proposed session on ChatGPT / generative AI for the
>    Wikimedia Hackathon in May. You can find that on Phabricator
>    <https://phabricator.wikimedia.org/T333127>.
>
> Finally, a huge thank you to @Maryana Pinchuk <[email protected]> who
> took the extensive and detailed notes on the call and also did a lot of
> "wrangling" behind the scenes to help draft the External Trends in the
> first place and get us to a point where we could have this discussion.
> Thank you, Maryana!
>
> Feel free to reach out anytime to connect about this or other topics. I'll
> be in Belgrade for the EduWiki conference in May and Singapore for
> Wikimania - if you're coming to either of those events or in the area, let
> me know - I'd love to meet in person!
>
> Best,
>
> Yael
>
> *Yael Weissburg* (she/her)
> VP, Partnerships, Programs & Grantmaking
> Wikimedia Foundation <https://wikimediafoundation.org/>
> M: (+1) 415.513.6643
> I work from San Francisco. My time zone is UTC -7/-8.
>
>
>
> On Fri, Mar 24, 2023 at 2:02 AM Paulo Santos Perneta <
> [email protected]> wrote:
>
>> Yes, please, make this a regular event, at least for the time being.
>> These discussions are incredibly useful, given the speed the developments
>> are happening in this area, and the complexity of the challenges we are
>> facing due to them.
>> And thanks a lot for organizing the meeting yesterday!
>>
>> Paulo
>>
>> Samuel Klein <[email protected]> wrote on Thursday, 23/03/2023 at
>> 21:11:
>>
>>> The Bau lab (that produced ROME) is great; see their update MEMIT
>>> https://memit.baulab.info scaling that approach.
>>>
>>> On Thu, Mar 23, 2023 at 3:43 PM Lauren Worden <[email protected]>
>>> wrote:
>>>
>>>> On Thu, Mar 23, 2023 at 12:20 PM Samuel Klein <[email protected]>
>>>> wrote:
>>>>
>>>>> Thanks Yael and all for hosting this!  A great conversation which we
>>>>> should revisit regularly.
>>>>>
>>>>
>>>> Yes, I hope that this can be a (monthly?) regularly occurring event
>>>> given the current state of very substantial advancements and improvements
>>>> in the field.
>>>>
>>>> I want to reiterate some links which I feel may be of considerable help
>>>> to those trying to understand our situation:
>>>>
>>>> RARR: https://arxiv.org/abs/2210.08726
>>>>
>>>> ROME: https://rome.baulab.info/
>>>>
>>>> -LW
>>>> _______________________________________________
>>>> Wikimedia-l mailing list -- [email protected],
>>>> guidelines at: https://meta.wikimedia.org/wiki/Mailing_lists/Guidelines
>>>> and https://meta.wikimedia.org/wiki/Wikimedia-l
>>>> Public archives at
>>>> https://lists.wikimedia.org/hyperkitty/list/[email protected]/message/ROUPXZQXNZSGXX5HKPLSUKIKZSR7LJT7/
>>>> To unsubscribe send an email to [email protected]
>>>
>>>
>>>
>>> --
>>> Samuel Klein          @metasj           w:user:sj          +1 617 529 4266
>>
>