Hi Bryan,

On Tue, 8 Oct 2024 at 17:59, Bryan Davis <bd...@wikimedia.org> wrote:
> Do you run a tool that needs data from Dumps to do its job? I would
> love to hear some stories about how this data helps folks advance the
> work of the movement.

Socksfinder¹ uses stub-meta-history to build an index of edits, which
is then used to find how often multiple accounts edit the same
articles and so help identify sockpuppets.
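
As a rough illustration of that idea (a sketch on made-up data, not
socksfinder's actual code): the function names and the sample edits
below are hypothetical, and a real run would read the (page, user)
pairs from a stub-meta-history dump instead of a hard-coded list.

```python
from collections import defaultdict
from itertools import combinations


def build_index(edits):
    """Map each user name to the set of page titles that user edited."""
    index = defaultdict(set)
    for page, user in edits:
        index[user].add(page)
    return index


def co_edited_pages(index, users):
    """For each pair of the given users, count the pages both edited."""
    return {
        (a, b): len(index.get(a, set()) & index.get(b, set()))
        for a, b in combinations(sorted(users), 2)
    }


# Made-up sample data: (page title, user name) pairs.
edits = [
    ("Paris", "Alice"),
    ("Paris", "Bob"),
    ("Lyon", "Alice"),
    ("Lyon", "Bob"),
    ("Nice", "Carol"),
    ("Paris", "Carol"),
]

index = build_index(edits)
overlap = co_edited_pages(index, ["Alice", "Bob", "Carol"])
for pair, count in sorted(overlap.items()):
    print(pair, count)
```

A high count for a pair of accounts is only a hint that they may be
related; it is not proof of anything by itself.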

Arkbot² uses pages-articles to build various lists of articles that
share certain properties to help maintenance projects on the French
Wikipedia (list of pages not linked to something…). Some of these
lists used to be provided as special pages by MediaWiki and were
disabled because of performance concerns; others are too specific to
be part of MediaWiki.

In both cases, I'm not 100% certain dumps are the best approach (I've
been thinking about using SQL queries on replicas and some of the
available APIs), but it works well enough™ and no other approach has
been so obviously better (if better at all) that I felt an urgent need
to rewrite my tools.

Also, in the past, I used Wikidata dumps to explore the limits of
some RDF tools, found those limits faster than I expected, and moved
on to other hobbies :-)

Best regards,

¹ https://socksfinder.toolforge.org/ (https://github.com/Arkanosis/socksfinder)
² https://fr.wikipedia.org/wiki/Utilisateur:Arkbot
(https://github.com/Arkanosis/arkbot-rs)

-- 
Jérémie
_______________________________________________
Cloud mailing list -- cloud@lists.wikimedia.org
List information: 
https://lists.wikimedia.org/postorius/lists/cloud.lists.wikimedia.org/
