I use dumps from time to time to parse data that I cannot get via
SQL/API. For example, in the summer I fetched Wikimedia Commons page
history to get the list of old categories of images, so that my bot
would not re-insert categories that had been removed from a photo at
least once.
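As a rough sketch of that kind of dump processing (not the actual bot code; the function names and the simple regex are my own illustration), one can diff the category links between consecutive revisions from a pages-meta-history dump to find categories that were removed at least once:

```python
import re

# Matches [[Category:Name]] links in wikitext; a "|sortkey" suffix is ignored.
CATEGORY_RE = re.compile(r"\[\[\s*Category\s*:\s*([^\]|]+)", re.IGNORECASE)


def extract_categories(wikitext):
    """Return the set of category names present in one revision's wikitext."""
    return {m.group(1).strip() for m in CATEGORY_RE.finditer(wikitext)}


def ever_removed_categories(revision_texts):
    """Given revision texts in chronological order, return every category
    that was present in some revision but absent from the next one,
    i.e. removed at least once (even if later re-added)."""
    removed = set()
    prev = extract_categories(revision_texts[0])
    for text in revision_texts[1:]:
        cur = extract_categories(text)
        removed |= prev - cur  # categories dropped in this edit
        prev = cur
    return removed
```

In practice the revision texts would be streamed from the pages-meta-history XML dump (e.g. with `xml.etree.ElementTree.iterparse` or the `mwxml` library) rather than held in memory, but the per-page logic stays the same.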

Br,
-- Kimmo Virtanen, Zache

On Tue, Oct 8, 2024 at 6:59 PM Bryan Davis <bd...@wikimedia.org> wrote:

> I was asked recently what I knew about the types of tools that use
> data from the https://dumps.wikimedia.org/ project. I had to admit
> that I really didn't know of many tools off the top of my head that
> relied on dumps. Most of the use cases I have heard about are for
> research topics like looking at word frequencies and sentence
> complexity, or machine learning things that consume some or all of the
> wiki corpus.
>
> Do you run a tool that needs data from Dumps to do its job? I would
> love to hear some stories about how this data helps folks advance the
> work of the movement.
>
> Bryan
> --
> Bryan Davis                                        Wikimedia Foundation
> Principal Software Engineer                               Boise, ID USA
> [[m:User:BDavis_(WMF)]]                                      irc: bd808
> _______________________________________________
> Cloud mailing list -- cloud@lists.wikimedia.org
> List information:
> https://lists.wikimedia.org/postorius/lists/cloud.lists.wikimedia.org/
>
