I love this. The Nigeria data (23M views for *Search* pages, 5M for *main
pages*, 7M for the entire rest of the list!) is a reminder of how important
those pages are to readers, and perhaps of how high bounce rates are. Small
improvements there make huge improvements to the site experience :) So might
making it easier for people to reach the right search result without being
taken to a Special:Search page! It would be great to see a *country* facet
on topviews
<https://pageviews.wmcloud.org/topviews/?project=de.wikipedia.org&platform=all-access&date=2022-05&excludes=>.
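
For anyone who wants to check those rough totals, here is a minimal sketch
that tallies views by page type, using a few rows copied from Isaac's Nigeria
list quoted below. The helper names (page_kind, tally) and the rule that any
title ending in ":Search" counts as a search page are assumptions made for
illustration; they are not taken from Isaac's notebook or Hal's script.

```python
from collections import Counter
from urllib.parse import unquote, urlsplit

# A few rows copied from the Nigeria list quoted below (URL, views).
ROWS = [
    ("https://en.wikipedia.org/wiki/Special:Search", 13696500),
    ("https://fr.wikipedia.org/wiki/Cookie_(informatique)", 10754500),
    ("https://ig.wikipedia.org/wiki/Special:Search", 7579900),
    ("https://en.wikipedia.org/wiki/Main_Page", 5502800),
    ("https://ig.wikipedia.org/wiki/Ihü_kárírí:Search", 1791900),
    ("https://en.wikipedia.org/wiki/Bet9ja", 664700),
    ("https://en.wikipedia.org/wiki/Nigeria", 491700),
]


def page_kind(url):
    """Classify a page URL as 'search', 'main_page' or 'other'.

    Assumption: search pages end in ':Search' in any namespace language,
    and only the English title 'Main_Page' is treated as a main page.
    """
    title = unquote(urlsplit(url).path).split("/wiki/", 1)[-1]
    if title.endswith(":Search"):
        return "search"
    if title == "Main_Page":
        return "main_page"
    return "other"


def tally(rows):
    """Sum views per page kind."""
    totals = Counter()
    for url, views in rows:
        totals[page_kind(url)] += views
    return totals


totals = tally(ROWS)
print(dict(totals))
```

The same page_kind check could be used to drop Special:Search rows before
publishing a per-country list; bot-driven pages such as Cookie would still
need a manual look.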

On Thu, Dec 8, 2022 at 5:53 PM Hal Triedman <[email protected]> wrote:

> Hi all!
>
> Looks like Isaac and I had the same thought here. I also spent ~45 minutes
> hacking together a script that collects the top (up to) 500 pages for a
> given country from 1 December 2021 through 30 November 2022 using the WMF
> pageviews API <https://wikimedia.org/api/rest_v1/#/Pageviews%20data>. All
> of the datasets are relatively small and available for download and free
> use
> <https://analytics.wikimedia.org/published/datasets/most_visited_articles_12.2021-11.2022/>.
> Code for generating these lists is available on the WMF gitlab instance
> <https://gitlab.wikimedia.org/htriedman/annual-top-pages>, and runs in
> ~3.5 hours on a normal MacBook, if anyone wants to download/fork it and try
> it on their own.
>
> There are only 135 ISO codes included in this set of files — I removed
> codes that WMF doesn't release data about or that have no data reported for
> the 365 day period in question. Let me know if you have any questions, and
> hope this helps!
>
> Hal
>
> On Thu, Dec 8, 2022 at 8:18 AM Isaac Johnson <[email protected]> wrote:
>
>> Romaine,
>> Building on Chico's comment, I put together an example notebook of how to
>> estimate such a list from the public data in case you're curious (I
>> calculated it for January-November for Nigeria in the example). It's not a
>> perfect approach in that it makes some assumptions and uses incomplete data
>> but probably is close to what the actual list would be (details in the
>> link). You'd likely want to use your knowledge of the region/languages to
>> filter out pages like Special:Search and bot-driven views that slipped
>> through into the data (like Cookie and Cleopatra in the example below).
>>
>> Notebook:
>> https://public.paws.wmcloud.org/User:Isaac_(WMF)/Top_Read_2022_Geo.ipynb#Example-Results-(Nigeria-for-2022)
>>
>> It makes use of these public Wikimedia resources:
>> * PAWS infrastructure: https://wikitech.wikimedia.org/wiki/PAWS
>> * Pageviews API:
>> https://wikitech.wikimedia.org/wiki/Analytics/AQS/Pageviews
>> * Python mwviews library for interacting with the pageviews API:
>> https://github.com/mediawiki-utilities/python-mwviews
>>
>> You can read instructions for how to copy this notebook and run it for
>> other countries here:
>> https://wikitech.wikimedia.org/wiki/PAWS/Getting_started_with_PAWS#Fork
>>
>> Best,
>> Isaac
>>
>> Copying the top-100 output for Nigeria below for ease of access:
>>
>> article views
>> 1 https://en.wikipedia.org/wiki/Special:Search 13696500
>> 2 https://fr.wikipedia.org/wiki/Cookie_(informatique) 10754500
>> 3 https://ig.wikipedia.org/wiki/Special:Search 7579900
>> 4 https://en.wikipedia.org/wiki/Main_Page 5502800
>> 5 https://ig.wikipedia.org/wiki/Ihü_kárírí:Search 1791900
>> 6 https://foundation.wikimedia.org/wiki/Privacy_policy 870000
>> 7 https://en.wikipedia.org/wiki/Bet9ja 664700
>> 8 https://foundation.wikimedia.org/wiki/Terms_of_Use 646900
>> 9 https://en.wikipedia.org/wiki/XXX 624200
>> 10 https://en.wikipedia.org/wiki/Nigeria 491700
>> 11 https://en.wikipedia.org/wiki/Cleopatra 429900
>> 12 https://en.wikipedia.org/wiki/Elizabeth_II 328400
>> 13 https://en.wikipedia.org/wiki/Bola_Tinubu 320300
>> 14 https://en.wikipedia.org/wiki/XXX_(film_series) 234700
>> 15 https://commons.wikimedia.org/wiki/Commons:Wiki_Loves_Africa_2022/en 230600
>> 16 https://en.wikipedia.org/wiki/Peter_Obi 229000
>> 17 https://fr.wikipedia.org/wiki/Enoch_Adeboye 197300
>> 18 https://commons.wikimedia.org/wiki/Commons:Wiki_Loves_Earth_2022_in_Nigeria 154600
>> 19 https://en.wikipedia.org/wiki/XXX:_Return_of_Xander_Cage 143000
>> 20 https://en.wikipedia.org/wiki/Vladimir_Putin 131100
>> 21 https://en.wikipedia.org/wiki/Russo-Ukrainian_War 122700
>> 22 https://en.wikipedia.org/wiki/XXXX_(beer) 116800
>> 23 https://en.wikipedia.org/wiki/Charles_III 114600
>> 24 https://en.wikipedia.org/wiki/Africa_Cup_of_Nations 112300
>> 25 https://en.wikipedia.org/wiki/Jeffrey_Dahmer 110300
>> 26 https://en.wikipedia.org/wiki/Yusuf_Datti_Baba-Ahmed 108700
>> 27 https://en.wikipedia.org/wiki/Cristiano_Ronaldo 106300
>> 28 https://commons.wikimedia.org/wiki/Commons:Wiki_Loves_Earth_2022_in_South_West_Nigeria 99700
>> 29 https://en.wikipedia.org/wiki/Atiku_Abubakar 91800
>> 30 https://en.wikipedia.org/wiki/2022_FIFA_World_Cup 91300
>> 31 https://en.wikipedia.org/wiki/NATO 86300
>> 32 https://en.wikipedia.org/wiki/Erling_Haaland 84800
>> 33 https://en.wikipedia.org/wiki/Russia–Ukraine_relations 84300
>> 34 https://en.wikipedia.org/wiki/2021_Africa_Cup_of_Nations 83300
>> 35 https://en.wikipedia.org/wiki/Diana,_Princess_of_Wales 81900
>> 36 https://en.wikipedia.org/wiki/Black_Adam_(film) 80600
>> 37 https://en.wikipedia.org/wiki/Black_Panther:_Wakanda_Forever 66800
>> 38 https://en.wikipedia.org/wiki/Ademola_Adeleke 66600
>> 39 https://en.wikipedia.org/wiki/Ukraine 65100
>> 40 https://en.wikipedia.org/wiki/Rishi_Sunak 60700
>> 41 https://en.wikipedia.org/wiki/Elon_Musk 60200
>> 42 https://en.wikipedia.org/wiki/Takeoff_(rapper) 58000
>> 43 https://en.wikipedia.org/wiki/House_of_the_Dragon 57500
>> 44 https://en.wikipedia.org/wiki/Casemiro 56800
>> 45 https://en.wikipedia.org/wiki/Prince_Philip,_Duke_of_Edinburgh 56500
>> 46 https://en.wikipedia.org/wiki/Member_states_of_NATO 56300
>> 47 https://en.wikipedia.org/wiki/Tobi_Amusan 54000
>> 48 https://en.wikipedia.org/wiki/George_VI 53400
>> 49 https://en.wikipedia.org/wiki/2022_Kenyan_general_election 53100
>> 50 https://en.wikipedia.org/wiki/2022_Russian_invasion_of_Ukraine 52900
>> 51 https://en.wikipedia.org/wiki/Kashim_Shettima 52400
>> 52 https://en.wikipedia.org/wiki/File:WhatsApp.svg 51000
>> 53 https://wikimania.wikimedia.org/wiki/Registration 48200
>> 54 https://en.wikipedia.org/wiki/Prince_Harry,_Duke_of_Sussex 47700
>> 55 https://en.wikipedia.org/wiki/The_Woman_King 45900
>> 56 https://en.wikipedia.org/wiki/Graham_Potter 44900
>> 57 https://en.wikipedia.org/wiki/Pierre-Emerick_Aubameyang 43900
>> 58 https://en.wikipedia.org/wiki/Antony_(footballer,_born_2000) 43700
>> 59 https://en.wikipedia.org/wiki/Ada_Ameh 43500
>> 60 https://en.wikipedia.org/wiki/Vincent_Aboubakar 43100
>> 61 https://en.wikipedia.org/wiki/Russia 42300
>> 62 https://en.wikipedia.org/wiki/Lionel_Messi 42300
>> 63 https://en.wikipedia.org/wiki/Moses_Simon 41800
>> 64 https://en.wikipedia.org/wiki/Karim_Benzema 41000
>> 65 https://en.wikipedia.org/wiki/List_of_political_parties_in_Nigeria 40500
>> 66 https://en.wikipedia.org/wiki/History_of_Nigeria 40200
>> 67 https://en.wikipedia.org/wiki/Bianca_Odumegwu-Ojukwu 39400
>> 68 https://en.wikipedia.org/wiki/Ahmad_Lawan 39400
>> 69 https://en.wikipedia.org/wiki/Doctor_Strange_in_the_Multiverse_of_Madness 39300
>> 70 https://en.wikipedia.org/wiki/2022_Women's_Africa_Cup_of_Nations 38500
>> 71 https://commons.wikimedia.org/wiki/Commons:Wiki_Loves_Monuments_2022_in_Nigeria 37700
>> 72 https://en.wikipedia.org/wiki/List_of_capitals_of_states_of_Nigeria 36500
>> 73 https://en.wikipedia.org/wiki/Valentine's_Day 36200
>> 74 https://en.wikipedia.org/wiki/Liz_Truss 35200
>> 75 https://en.wikipedia.org/wiki/Maduka_Okoye 34800
>> 76 https://en.wikipedia.org/wiki/Soviet_Union 34600
>> 77 https://en.wikipedia.org/wiki/Raheem_Sterling 33700
>> 78 https://en.wikipedia.org/wiki/Roman_Abramovich 32400
>> 79 https://en.wikipedia.org/wiki/Anne,_Princess_Royal 32300
>> 80 https://en.wikipedia.org/wiki/Edward_VIII 32000
>> 81 https://en.wikipedia.org/wiki/William,_Prince_of_Wales 32000
>> 82 https://en.wikipedia.org/wiki/Ìyál'ọ́jà 31900
>> 83 https://en.wikipedia.org/wiki/Simon_Leviev 31300
>> 84 https://en.wikipedia.org/wiki/Alchemy_of_Souls 31100
>> 85 https://en.wikipedia.org/wiki/Volodymyr_Zelenskyy 29300
>> 86 https://en.wikipedia.org/wiki/List_of_state_governors_of_Nigeria 28700
>> 87 https://en.wikipedia.org/wiki/Isiaka_Adeleke 28500
>> 88 https://en.wikipedia.org/wiki/The_Headies_2022 28300
>> 89 https://en.wikipedia.org/wiki/Lisandro_Martínez 28200
>> 90 https://en.wikipedia.org/wiki/XXX:_State_of_the_Union 27100
>> 91 https://en.wikipedia.org/wiki/Independence_Day_(Nigeria) 27000
>> 92 https://en.wikipedia.org/wiki/Women's_Africa_Cup_of_Nations 26800
>> 93 https://en.wikipedia.org/wiki/Big_Brother_Naija_(season_7) 26800
>> 94 https://en.wikipedia.org/wiki/Marc_Cucurella 26500
>> 95 https://en.wikipedia.org/wiki/FIFA_World_Cup 26300
>> 96 https://en.wikipedia.org/wiki/List_of_Nigerian_Grammy_Award_winners_and_nominees 26100
>> 97 https://en.wikipedia.org/wiki/Qatar 26100
>> 98 https://meta.wikimedia.org/wiki/International_Museum_Day_2022 25800
>> 99 https://en.wikipedia.org/wiki/Joyce_Vincent 25700
>> 100 https://en.wikipedia.org/wiki/Aníkúlápó_(2022_film) 25300
>>
>>
>> On Wed, Dec 7, 2022 at 7:41 PM Romaine Wiki <[email protected]>
>> wrote:
>>
>>> For some languages it is easy as a particular language is spoken in one
>>> country mainly. (Still there might be some local languages/dialects that
>>> are then not represented in the data.)
>>>
>>> For some other languages it is not easy to get the statistics of the
>>> most visited pages of a country as the language is spoken in multiple
>>> countries.
>>>
>>> If for example one country only has 3% of the population in comparison
>>> to another country with the same language, the language statistics are very
>>> biased. The larger country consumes so much data, that the data of the
>>> country with the smaller population is invisible. If we have no data for
>>> them, we let those unseen communities down.
>>>
>>> Romaine
>>>
>>>
>>> On Wed, 7 Dec 2022 at 18:21, Jan Ainali <[email protected]> wrote:
>>>
>>>> On Swedish Wikipedia we collect it on one page:
>>>> https://sv.wikipedia.org/wiki/Wikipedia:Mest_visade_artiklar_2022
>>>>
>>>> Doing it per language is much easier than per country, as the data is
>>>> publicly available.
>>>>
>>>> Best,
>>>> Jan Ainali
>>>>
>>>>
>>>> On Wed, 7 Dec 2022 at 16:36, Romaine Wiki <[email protected]> wrote:
>>>>
>>>>> Every year it makes the news headlines: the top 10 or top 100 most
>>>>> popular Google searches of the past year in my country. I have seen
>>>>> this in some other countries too.
>>>>>
>>>>> People are interested, and by making this data public, something
>>>>> positive is said about Google (besides all the negative news about them
>>>>> during the rest of the year).
>>>>>
>>>>> This is something simple Wikimedia could do too: sharing this kind of
>>>>> data (*by country*) with the world. It would bring Wikipedia closer to
>>>>> the public and create more positive awareness.
>>>>> Alternatively, this data could be made available to the local chapters
>>>>> so they can bring positive news about Wikipedia.
>>>>>
>>>>>
>>>>> Romaine
>>>>>
>>>>>
>>>>>
>>>>> _______________________________________________
>>>>> Wikimedia-l mailing list -- [email protected],
>>>>> guidelines at:
>>>>> https://meta.wikimedia.org/wiki/Mailing_lists/Guidelines and
>>>>> https://meta.wikimedia.org/wiki/Wikimedia-l
>>>>> Public archives at
>>>>> https://lists.wikimedia.org/hyperkitty/list/[email protected]/message/HXWUCYYRLL44LFIPZ6YXHLLDL7H63ZKD/
>>>>> To unsubscribe send an email to [email protected]
>>>>
>>>
>>
>>
>>
>> --
>> Isaac Johnson (he/him/his) -- Senior Research Scientist -- Wikimedia
>> Foundation
>



-- 
Samuel Klein          @metasj           w:user:sj          +1 617 529 4266
