* Frédéric Schütz wrote:
>So, why would "404 error" and "File:Hardy_boys_cover_09.jpg" be ranked
>so high ? "404 error" is constant through the year (it may be a link
>from a 404 page on a web server, but I'd be still surprised that it is
>clicked so often), but the other is viewed in bursts (see e.g.
>http://stats.grok.se/en/201012/File:Hardy_boys_cover_09.jpg). Any idea ?

As I understand it, the access counts are derived from the frontend cache log files, with little attempt to filter out anomalies such as someone putting a stone on their keyboard to hold the F5 key down, which would make their web browser reload the same page again and again. More realistically, there are malfunctioning bots that get stuck in a loop and request the same page many times, and there are probably deliberate attempts to push certain topics (even if you keep it low and request a page only once per minute, that is still about half a million requests per year, quite enough to get into "top" lists on the smaller Wikipedia versions, for instance). If you look further down your list, you will probably find that Special:Export/* is extremely popular even though it is a very obscure feature; apparently some articles are exported hundreds of thousands of times.

You would need additional data, such as the IP addresses the requests come from or the Referer header of the requests, to attempt guesses about individual cases (that data cannot be published for privacy reasons, though; I do not know whether it is collected at all, or in what form). For the 404 error you could probably verify that the page is the #1 result for queries about it on various search engines in many locales (if you assume there are 2 billion Internet users, and 6.6% of them bing the error and end up on the Wikipedia page, that would already explain the number).

So the numbers are rather rough and won't really tell you anything you did not already know (Sex > Astrobiology, no surprise there), and you can't really say Steve Jobs > Justin Bieber based on this data without explaining all the caveats anyway. The data is more useful if you look for general trends, as in http://katograph.appspot.com/, which tells you things such as that articles on people in Film are viewed much more often than articles on people in Sports, which is at least slightly non-obvious.
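
For what it is worth, the two back-of-envelope figures above can be checked with a few lines of Python. This is only a sketch: the function names are made up for illustration, and the inputs (one request per minute, 2 billion Internet users, 6.6%) are the assumed figures from the text, not measured data.

```python
# Back-of-envelope check of the two estimates in the text.
# All inputs are assumptions taken from the message, not measurements.

def requests_per_year(requests_per_minute: int) -> int:
    """Requests sent in a 365-day year at a constant per-minute rate."""
    minutes_per_year = 60 * 24 * 365
    return requests_per_minute * minutes_per_year


def views_from_share(users: int, share_per_mille: int) -> int:
    """Views if a given share of users (in tenths of a percent) each
    land on the page once. Integer arithmetic avoids float rounding."""
    return users * share_per_mille // 1000


# One request per minute from a single client.
bot_requests = requests_per_year(1)

# 6.6% (66 per mille) of an assumed 2 billion Internet users.
search_views = views_from_share(2_000_000_000, 66)

print(bot_requests)   # 525600, i.e. about half a million
print(search_views)   # 132000000
```

So a single client asking once a minute already produces roughly half a million requests a year, and the search-engine assumption gives 132 million views.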

-- 
Björn Höhrmann · mailto:bjo...@hoehrmann.de · http://bjoern.hoehrmann.de
Am Badedeich 7 · Telefon: +49(0)160/4415681 · http://www.bjoernsworld.de
25899 Dagebüll · PGP Pub. KeyID: 0xA4357E78 · http://www.websitedev.de/