Anthony wrote:
> Wow, what's Wikipedia's policy about using a bot to scrape everything?
I don't know about any policy, but I think it should still be discouraged. For me this has less to do with predation on other sites than with our inability to keep up with the volume of data that would be produced. Proofreading and wikifying are labour-intensive processes. It is very easy for the technically minded to bring the scan and OCR of a 500-page book under our roof, but without the manpower to add that value, the result is scarcely better than a data dump.

Ec

> On Sat, Jun 20, 2009 at 2:47 PM, Brian <brian.min...@colorado.edu> wrote:
>
>> That is against the law. It violates Google's ToS.
>>
>> I'm mostly complaining that Google is being Very Evil. There is nothing
>> we can do about it except complain to them. Which I don't know how to
>> do - they apparently believe that the plain text versions of their books
>> are akin to their intellectual property and are unwilling to give them
>> away.
>>
>> On Sat, Jun 20, 2009 at 12:34 PM, Falcorian wrote:
>>
>>> So the bot just has to run at human speeds so it does not get banned;
>>> it still won't get tired or make unpredictable mistakes. And you can
>>> run it from different IPs to parallelize.
>>>
>>> --Falcorian