From time to time I use dumps for parsing data that I cannot get via SQL/API. For example, last summer I fetched Wikimedia Commons page histories to get the list of old categories of images, so that my bot would not re-insert categories that had at least once been removed from a photo.
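The core of that check can be sketched roughly as follows. This is a minimal illustration, not the actual bot code: it assumes revision wikitexts have already been pulled from the dump (a real pipeline would parse the XML dump with a library such as mwxml), and it uses a simplified regex for category links.

```python
import re

# Matches [[Category:Name]] and [[Category:Name|sortkey]] links.
CATEGORY_RE = re.compile(r"\[\[Category:([^\]|]+)", re.IGNORECASE)


def categories(wikitext):
    """Extract the set of category names from one revision's wikitext."""
    return {name.strip() for name in CATEGORY_RE.findall(wikitext)}


def removed_categories(revisions):
    """Given revision texts ordered oldest to newest, return categories
    that appeared in an earlier revision but are absent from the latest
    one, i.e. categories that were removed at least once and should not
    be re-inserted by the bot."""
    if not revisions:
        return set()
    seen = set()
    for text in revisions[:-1]:
        seen |= categories(text)
    return seen - categories(revisions[-1])
```

For example, if an old revision contained `[[Category:Finland]]` but the current one does not, `removed_categories` reports `Finland`, and the bot can skip adding it back.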
Br,
-- Kimmo Virtanen, Zache

On Tue, Oct 8, 2024 at 6:59 PM Bryan Davis <bd...@wikimedia.org> wrote:
> I was asked recently what I knew about the types of tools that use
> data from the https://dumps.wikimedia.org/ project. I had to admit
> that I really didn't know of many tools off the top of my head that
> relied on dumps. Most of the use cases I have heard about are for
> research topics like looking at word frequencies and sentence
> complexity, or machine learning things that consume some or all of the
> wiki corpus.
>
> Do you run a tool that needs data from Dumps to do its job? I would
> love to hear some stories about how this data helps folks advance the
> work of the movement.
>
> Bryan
> --
> Bryan Davis              Wikimedia Foundation
> Principal Software Engineer              Boise, ID USA
> [[m:User:BDavis_(WMF)]]              irc: bd808
> _______________________________________________
> Cloud mailing list -- cloud@lists.wikimedia.org
> List information:
> https://lists.wikimedia.org/postorius/lists/cloud.lists.wikimedia.org/