On Thu, Aug 20, 2009 at 7:58 PM, Robert Rohde <raro...@gmail.com> wrote:
> You seem to be identifying all errors with vandalism.

How so?

> Sometimes factual errors are simply unintentional mistakes.

Obviously we can't know the intent of the person for sure, but after a
mistake is found it's relatively simple to find where it was added and
decide whether or not we are going to call it vandalism. This is an
inherent problem with answering the question. If you can't determine it
manually, you sure as hell won't be able to determine it using automated
methods.

> Let me describe the issue differently. The practical issue I am
> concerned with might be better expressed as the following: For any
> given article, what is the probability that the current revision is
> not the best available revision (i.e. most accurate, most complete,
> etc.) Vandalism, in general, takes a page and makes it worse. I am
> interested in the problem of characterizing how often this happens
> with an eye to being able to go back to that prior better version.
> (This also explains why I am less interested in vandalism that
> persists through many revisions. Once that occurs, it makes less
> sense to try and go back to the pre-vandalized revision.)

*nod*. Yes, we certainly have different things we're interested in
measuring. If someone vandalizes an article, say to change the
population of a country from 3 million to 2.9 million, and then 20 other
people improve the article without fixing that fact, I'd still count
that as vandalized.

On the other hand, are you sure you don't want to add an "indisputably"
before "not the best available revision"? I mean, I'd say Wikipedia is
probably in the double digit percentages, at least in terms of popular
articles, if you don't add "indisputably".

_______________________________________________
foundation-l mailing list
foundation-l@lists.wikimedia.org
Unsubscribe: https://lists.wikimedia.org/mailman/listinfo/foundation-l