Hey! I mentioned this at the meeting tonight and thought I'd share it. I'm also wondering if anyone here has thoughts on how to script this to make it a little more systematic.
My project was to improve the diversity of photos on pages about careers, since NPOV is slightly ambiguous there and we know representation has an impact on kids. The basic strategy was a Wikidata query on… jobs? careers? This was then joined with pageview data so that I could prioritize the pages by traffic. Someone on Twitter helped me find the right Wikidata items and construct the query; sadly, I can't find it in my notes. (I’ve been poking at building a new one with the help of ChatGPT but haven’t had much time for it.) The output was a CSV that I then jammed into Google Sheets for tracking, but presumably it wouldn't be that hard to regenerate the list dynamically (and extend it beyond enwiki).
I then simply did a lot of Flickr, IA, and US government searches to find better photos: not just women, but also geographic and racial diversity. Some were pretty easy (especially where the US government has many people in the named career role), but others were harder. As a general matter, I didn't start with Commons; I mostly assumed I had to look off Commons first and then bring the images to Commons, though that wasn't always true.
Some example edits:
- adding women, an African, and an Asian to “Presidents”: https://en.wikipedia.org/w/index.php?title=President_(government_title)&diff=prev&oldid=841458719
- add a woman to "Sommelier": https://en.wikipedia.org/w/index.php?title=Sommelier&diff=prev&oldid=842546650
- add an African man and Mexican group to “Chef”: https://en.wikipedia.org/w/index.php?title=Chef&diff=prev&oldid=842550576
- add a gender-diverse photo and black man to “System Administrator”; if I recall correctly the black man was reverted, but I didn’t fight it too hard: https://en.wikipedia.org/w/index.php?title=System_administrator&diff=prev&oldid=840728240
I seem to recall that the attempt to diversify “Lawyer” was reverted, but most of the others stuck, at least in the short term.
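For anyone who wants to try scripting the query-plus-pageviews step, here is a rough sketch of how the list might be regenerated. It is not the original query (that one is lost). It assumes Q12737077 ("occupation") is a reasonable starting item, uses the public Wikidata Query Service and the Wikimedia pageviews REST API, and the function names, date range, and User-Agent string are all placeholders I made up for illustration.

```python
import json
import urllib.parse
import urllib.request

# Assumption: Q12737077 ("occupation") stands in for the items used in the
# original, lost query. Swap in whatever items turn out to be right.
OCCUPATION_QUERY = """
SELECT ?item ?article WHERE {
  ?item wdt:P31 wd:Q12737077 .
  ?article schema:about ?item ;
           schema:isPartOf <https://en.wikipedia.org/> .
}
"""

# Placeholder User-Agent; Wikimedia asks clients to send a descriptive one.
USER_AGENT = "career-photo-diversity-sketch/0.1 (placeholder contact)"


def fetch_json(url):
    """GET a URL and decode the JSON response body."""
    request = urllib.request.Request(url, headers={"User-Agent": USER_AGENT})
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def title_from_url(article_url):
    """Extract the page title (with underscores) from an enwiki article URL."""
    return urllib.parse.unquote(article_url.split("/wiki/", 1)[-1])


def fetch_occupation_titles():
    """Run OCCUPATION_QUERY and return the matching enwiki page titles."""
    params = urllib.parse.urlencode({"query": OCCUPATION_QUERY, "format": "json"})
    rows = fetch_json("https://query.wikidata.org/sparql?" + params)
    return [
        title_from_url(row["article"]["value"])
        for row in rows["results"]["bindings"]
    ]


def fetch_views(title, start="20240101", end="20240131"):
    """Sum monthly user pageviews for one enwiki title over a date range."""
    url = (
        "https://wikimedia.org/api/rest_v1/metrics/pageviews/per-article/"
        "en.wikipedia/all-access/user/"
        + urllib.parse.quote(title, safe="")
        + f"/monthly/{start}/{end}"
    )
    return sum(item["views"] for item in fetch_json(url)["items"])


def rank_by_traffic(views_by_title):
    """Sort (title, views) pairs so the highest-traffic pages come first."""
    return sorted(views_by_title.items(), key=lambda pair: pair[1], reverse=True)
```

The idea would be to call fetch_occupation_titles(), look up fetch_views() for each title (one request per page, so it would need throttling for a long list), and pass the resulting dict to rank_by_traffic() to get the prioritized list. Changing the sitelink URL and the project name in the pageviews path is where extending beyond enwiki would start.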
Now that Wikidata has matured, and there are maybe more photos out there, it'd be interesting to turn this into something more structured. For example, there are obviously problems with relying on LLMs to do gender identification of photos, but what about using them as a first pass to identify the most problematic pages?
Anyway, throwing that out into the void.
Luis
_______________________________________________ Wikimedia-SF mailing list -- [email protected] To unsubscribe send an email to [email protected]
