I love this. The Nigeria data (23M views for *Search* pages, 5M for *main pages*, 7M for the entire rest of the list!) is a reminder of how important those pages are to readers, and perhaps of how high bounce rates are... Small improvements there would make a huge difference to the site experience :) So might making it easier for people to reach the right search result without being taken to a Special:Search page!

It would be great to see a *country* facet on topviews <https://pageviews.wmcloud.org/topviews/?project=de.wikipedia.org&platform=all-access&date=2022-05&excludes=>.
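For anyone who wants to check that split against the quoted Nigeria list, here is a minimal Python sketch. It uses only a handful of rows copied from the list in this thread. The names `ROWS`, `FLAGGED`, `bucket`, and `totals` are made up for illustration, and the matching rules are assumptions: a title ending in ":Search" counts as a search page, only the English `Main_Page` title counts as a main page, and the "flagged" bucket holds the two pages Isaac called out as likely bot-driven.

```python
from collections import defaultdict

# A few (url, views) rows copied from the Nigeria list quoted in this thread.
ROWS = [
    ("https://en.wikipedia.org/wiki/Special:Search", 13696500),
    ("https://fr.wikipedia.org/wiki/Cookie_(informatique)", 10754500),
    ("https://ig.wikipedia.org/wiki/Special:Search", 7579900),
    ("https://en.wikipedia.org/wiki/Main_Page", 5502800),
    ("https://ig.wikipedia.org/wiki/Ihü_kárírí:Search", 1791900),
    ("https://en.wikipedia.org/wiki/Bet9ja", 664700),
    ("https://en.wikipedia.org/wiki/Nigeria", 491700),
    ("https://en.wikipedia.org/wiki/Cleopatra", 429900),
]

# Assumption: the two titles Isaac mentioned as likely bot-driven views.
FLAGGED = {"Cookie_(informatique)", "Cleopatra"}


def bucket(url):
    """Assign a page URL to a coarse category based on its title."""
    title = url.rsplit("/wiki/", 1)[-1]
    if title.endswith(":Search"):  # assumption: covers localized search pages
        return "search"
    if title == "Main_Page":  # assumption: English main page title only
        return "main_page"
    if title in FLAGGED:
        return "flagged"
    return "other"


def totals(rows):
    """Sum views per category."""
    out = defaultdict(int)
    for url, views in rows:
        out[bucket(url)] += views
    return dict(out)


if __name__ == "__main__":
    for name, views in sorted(totals(ROWS).items()):
        print(f"{name:10s} {views:>12,d}")
```

Running the same two functions over the full 100-row list would give the totals for the whole table. The main-page rule would need extending for other language editions.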

On Thu, Dec 8, 2022 at 5:53 PM Hal Triedman <[email protected]> wrote:

> Hi all!
>
> Looks like Isaac and I had the same thought here. I also spent ~45 minutes
> hacking together a script that collects the top (up to) 500 pages for a
> given country from 1 December 2021 through 30 November 2022 using the WMF
> pageviews API <https://wikimedia.org/api/rest_v1/#/Pageviews%20data>. All
> of the datasets are relatively small and available for download and free
> use
> <https://analytics.wikimedia.org/published/datasets/most_visited_articles_12.2021-11.2022/>.
> Code for generating these lists is available on the WMF gitlab instance
> <https://gitlab.wikimedia.org/htriedman/annual-top-pages>, and runs in
> ~3.5 hours on a normal MacBook, if anyone wants to download/fork it and try
> it on their own.
>
> There are only 135 ISO codes included in this set of files: I removed
> codes that WMF doesn't release data about or that have no data reported for
> the 365-day period in question. Let me know if you have any questions, and
> hope this helps!
>
> Hal
>
> On Thu, Dec 8, 2022 at 8:18 AM Isaac Johnson <[email protected]> wrote:
>
>> Romaine,
>> Building on Chico's comment, I put together an example notebook of how to
>> estimate such a list from the public data in case you're curious (I
>> calculated it for January-November for Nigeria in the example). It's not a
>> perfect approach in that it makes some assumptions and uses incomplete data
>> but probably is close to what the actual list would be (details in the
>> link). You'd likely want to use your knowledge of the region/languages to
>> filter out pages like Special:Search and bot-driven views that slipped
>> through into the data (like Cookie and Cleopatra in the example below).
>>
>> Notebook:
>> https://public.paws.wmcloud.org/User:Isaac_(WMF)/Top_Read_2022_Geo.ipynb#Example-Results-(Nigeria-for-2022)
>>
>> It makes use of these public Wikimedia resources:
>> * PAWS infrastructure: https://wikitech.wikimedia.org/wiki/PAWS
>> * Pageviews API:
>> https://wikitech.wikimedia.org/wiki/Analytics/AQS/Pageviews
>> * Python mwviews library for interacting with the pageviews API:
>> https://github.com/mediawiki-utilities/python-mwviews
>>
>> You can read instructions for how to copy this notebook and run it for
>> other countries here:
>> https://wikitech.wikimedia.org/wiki/PAWS/Getting_started_with_PAWS#Fork
>>
>> Best,
>> Isaac
>>
>> Copying the top-100 output for Nigeria below for ease of access:
>>
>> rank article views
>> 1 https://en.wikipedia.org/wiki/Special:Search 13696500
>> 2 https://fr.wikipedia.org/wiki/Cookie_(informatique) 10754500
>> 3 https://ig.wikipedia.org/wiki/Special:Search 7579900
>> 4 https://en.wikipedia.org/wiki/Main_Page 5502800
>> 5 https://ig.wikipedia.org/wiki/Ihü_kárírí:Search 1791900
>> 6 https://foundation.wikimedia.org/wiki/Privacy_policy 870000
>> 7 https://en.wikipedia.org/wiki/Bet9ja 664700
>> 8 https://foundation.wikimedia.org/wiki/Terms_of_Use 646900
>> 9 https://en.wikipedia.org/wiki/XXX 624200
>> 10 https://en.wikipedia.org/wiki/Nigeria 491700
>> 11 https://en.wikipedia.org/wiki/Cleopatra 429900
>> 12 https://en.wikipedia.org/wiki/Elizabeth_II 328400
>> 13 https://en.wikipedia.org/wiki/Bola_Tinubu 320300
>> 14 https://en.wikipedia.org/wiki/XXX_(film_series) 234700
>> 15 https://commons.wikimedia.org/wiki/Commons:Wiki_Loves_Africa_2022/en 230600
>> 16 https://en.wikipedia.org/wiki/Peter_Obi 229000
>> 17 https://fr.wikipedia.org/wiki/Enoch_Adeboye 197300
>> 18 https://commons.wikimedia.org/wiki/Commons:Wiki_Loves_Earth_2022_in_Nigeria 154600
>> 19 https://en.wikipedia.org/wiki/XXX:_Return_of_Xander_Cage 143000
>> 20 https://en.wikipedia.org/wiki/Vladimir_Putin 131100
>> 21 https://en.wikipedia.org/wiki/Russo-Ukrainian_War 122700
>> 22 https://en.wikipedia.org/wiki/XXXX_(beer) 116800
>> 23 https://en.wikipedia.org/wiki/Charles_III 114600
>> 24 https://en.wikipedia.org/wiki/Africa_Cup_of_Nations 112300
>> 25 https://en.wikipedia.org/wiki/Jeffrey_Dahmer 110300
>> 26 https://en.wikipedia.org/wiki/Yusuf_Datti_Baba-Ahmed 108700
>> 27 https://en.wikipedia.org/wiki/Cristiano_Ronaldo 106300
>> 28 https://commons.wikimedia.org/wiki/Commons:Wiki_Loves_Earth_2022_in_South_West_Nigeria 99700
>> 29 https://en.wikipedia.org/wiki/Atiku_Abubakar 91800
>> 30 https://en.wikipedia.org/wiki/2022_FIFA_World_Cup 91300
>> 31 https://en.wikipedia.org/wiki/NATO 86300
>> 32 https://en.wikipedia.org/wiki/Erling_Haaland 84800
>> 33 https://en.wikipedia.org/wiki/Russia–Ukraine_relations 84300
>> 34 https://en.wikipedia.org/wiki/2021_Africa_Cup_of_Nations 83300
>> 35 https://en.wikipedia.org/wiki/Diana,_Princess_of_Wales 81900
>> 36 https://en.wikipedia.org/wiki/Black_Adam_(film) 80600
>> 37 https://en.wikipedia.org/wiki/Black_Panther:_Wakanda_Forever 66800
>> 38 https://en.wikipedia.org/wiki/Ademola_Adeleke 66600
>> 39 https://en.wikipedia.org/wiki/Ukraine 65100
>> 40 https://en.wikipedia.org/wiki/Rishi_Sunak 60700
>> 41 https://en.wikipedia.org/wiki/Elon_Musk 60200
>> 42 https://en.wikipedia.org/wiki/Takeoff_(rapper) 58000
>> 43 https://en.wikipedia.org/wiki/House_of_the_Dragon 57500
>> 44 https://en.wikipedia.org/wiki/Casemiro 56800
>> 45 https://en.wikipedia.org/wiki/Prince_Philip,_Duke_of_Edinburgh 56500
>> 46 https://en.wikipedia.org/wiki/Member_states_of_NATO 56300
>> 47 https://en.wikipedia.org/wiki/Tobi_Amusan 54000
>> 48 https://en.wikipedia.org/wiki/George_VI 53400
>> 49 https://en.wikipedia.org/wiki/2022_Kenyan_general_election 53100
>> 50 https://en.wikipedia.org/wiki/2022_Russian_invasion_of_Ukraine 52900
>> 51 https://en.wikipedia.org/wiki/Kashim_Shettima 52400
>> 52 https://en.wikipedia.org/wiki/File:WhatsApp.svg 51000
>> 53 https://wikimania.wikimedia.org/wiki/Registration 48200
>> 54 https://en.wikipedia.org/wiki/Prince_Harry,_Duke_of_Sussex 47700
>> 55 https://en.wikipedia.org/wiki/The_Woman_King 45900
>> 56 https://en.wikipedia.org/wiki/Graham_Potter 44900
>> 57 https://en.wikipedia.org/wiki/Pierre-Emerick_Aubameyang 43900
>> 58 https://en.wikipedia.org/wiki/Antony_(footballer,_born_2000) 43700
>> 59 https://en.wikipedia.org/wiki/Ada_Ameh 43500
>> 60 https://en.wikipedia.org/wiki/Vincent_Aboubakar 43100
>> 61 https://en.wikipedia.org/wiki/Russia 42300
>> 62 https://en.wikipedia.org/wiki/Lionel_Messi 42300
>> 63 https://en.wikipedia.org/wiki/Moses_Simon 41800
>> 64 https://en.wikipedia.org/wiki/Karim_Benzema 41000
>> 65 https://en.wikipedia.org/wiki/List_of_political_parties_in_Nigeria 40500
>> 66 https://en.wikipedia.org/wiki/History_of_Nigeria 40200
>> 67 https://en.wikipedia.org/wiki/Bianca_Odumegwu-Ojukwu 39400
>> 68 https://en.wikipedia.org/wiki/Ahmad_Lawan 39400
>> 69 https://en.wikipedia.org/wiki/Doctor_Strange_in_the_Multiverse_of_Madness 39300
>> 70 https://en.wikipedia.org/wiki/2022_Women's_Africa_Cup_of_Nations 38500
>> 71 https://commons.wikimedia.org/wiki/Commons:Wiki_Loves_Monuments_2022_in_Nigeria 37700
>> 72 https://en.wikipedia.org/wiki/List_of_capitals_of_states_of_Nigeria 36500
>> 73 https://en.wikipedia.org/wiki/Valentine's_Day 36200
>> 74 https://en.wikipedia.org/wiki/Liz_Truss 35200
>> 75 https://en.wikipedia.org/wiki/Maduka_Okoye 34800
>> 76 https://en.wikipedia.org/wiki/Soviet_Union 34600
>> 77 https://en.wikipedia.org/wiki/Raheem_Sterling 33700
>> 78 https://en.wikipedia.org/wiki/Roman_Abramovich 32400
>> 79 https://en.wikipedia.org/wiki/Anne,_Princess_Royal 32300
>> 80 https://en.wikipedia.org/wiki/Edward_VIII 32000
>> 81 https://en.wikipedia.org/wiki/William,_Prince_of_Wales 32000
>> 82 https://en.wikipedia.org/wiki/Ìyál'ọ́jà 31900
>> 83 https://en.wikipedia.org/wiki/Simon_Leviev 31300
>> 84 https://en.wikipedia.org/wiki/Alchemy_of_Souls 31100
>> 85 https://en.wikipedia.org/wiki/Volodymyr_Zelenskyy 29300
>> 86 https://en.wikipedia.org/wiki/List_of_state_governors_of_Nigeria 28700
>> 87 https://en.wikipedia.org/wiki/Isiaka_Adeleke 28500
>> 88 https://en.wikipedia.org/wiki/The_Headies_2022 28300
>> 89 https://en.wikipedia.org/wiki/Lisandro_Martínez 28200
>> 90 https://en.wikipedia.org/wiki/XXX:_State_of_the_Union 27100
>> 91 https://en.wikipedia.org/wiki/Independence_Day_(Nigeria) 27000
>> 92 https://en.wikipedia.org/wiki/Women's_Africa_Cup_of_Nations 26800
>> 93 https://en.wikipedia.org/wiki/Big_Brother_Naija_(season_7) 26800
>> 94 https://en.wikipedia.org/wiki/Marc_Cucurella 26500
>> 95 https://en.wikipedia.org/wiki/FIFA_World_Cup 26300
>> 96 https://en.wikipedia.org/wiki/List_of_Nigerian_Grammy_Award_winners_and_nominees 26100
>> 97 https://en.wikipedia.org/wiki/Qatar 26100
>> 98 https://meta.wikimedia.org/wiki/International_Museum_Day_2022 25800
>> 99 https://en.wikipedia.org/wiki/Joyce_Vincent 25700
>> 100 https://en.wikipedia.org/wiki/Aníkúlápó_(2022_film) 25300
>>
>>
>> On Wed, Dec 7, 2022 at 7:41 PM Romaine Wiki <[email protected]>
>> wrote:
>>
>>> For some languages it is easy, as a particular language is spoken mainly
>>> in one country. (Still, there might be some local languages/dialects that
>>> are then not represented in the data.)
>>>
>>> For some other languages it is not easy to get the statistics of the
>>> most visited pages of a country, as the language is spoken in multiple
>>> countries.
>>>
>>> If, for example, one country has only 3% of the population of another
>>> country with the same language, the language statistics are very biased.
>>> The larger country consumes so much data that the data of the country
>>> with the smaller population is invisible. If we have no data for them, we
>>> let those unseen communities down.
>>>
>>> Romaine
>>>
>>>
>>> On Wed, Dec 7, 2022 at 18:21, Jan Ainali <[email protected]> wrote:
>>>
>>>> On Swedish Wikipedia we collect it on one page:
>>>> https://sv.wikipedia.org/wiki/Wikipedia:Mest_visade_artiklar_2022
>>>>
>>>> Doing it per language is much easier than per country, as the data is
>>>> publicly available.
>>>>
>>>> Best,
>>>> Jan Ainali
>>>>
>>>>
>>>> On Wed, Dec 7, 2022 at 16:36, Romaine Wiki <[email protected]> wrote:
>>>>
>>>>> Every year it reaches the headlines of the news: the top 10 or top 100
>>>>> most popular Google searches of the past year in my country. I have
>>>>> seen this in some other countries too.
>>>>>
>>>>> People are interested, and by making this data public, something
>>>>> positive is said about Google (besides all the negative news about them
>>>>> during the rest of the year).
>>>>>
>>>>> This is something simple Wikimedia could do too: sharing this kind of
>>>>> data (*by country*) with the world. It would bring Wikipedia closer to the
>>>>> public and create more positive awareness.
>>>>> Or otherwise, make this data available to the local chapters so they
>>>>> can bring positive news about Wikipedia.
>>>>>
>>>>>
>>>>> Romaine
>>>>>
>>>>>
>>>>>
>>>>> _______________________________________________
>>>>> Wikimedia-l mailing list -- [email protected],
>>>>> guidelines at:
>>>>> https://meta.wikimedia.org/wiki/Mailing_lists/Guidelines and
>>>>> https://meta.wikimedia.org/wiki/Wikimedia-l
>>>>> Public archives at
>>>>> https://lists.wikimedia.org/hyperkitty/list/[email protected]/message/HXWUCYYRLL44LFIPZ6YXHLLDL7H63ZKD/
>>>>> To unsubscribe send an email to [email protected]
>>
>>
>>
>> --
>> Isaac Johnson (he/him/his) -- Senior Research Scientist -- Wikimedia
>> Foundation

--
Samuel Klein @metasj w:user:sj +1 617 529 4266
