On Mon, 25 Sep 2006 13:51:55 +0200, Fredrik Lundh wrote:

> http://www.google.com/terms_of_service.html
>
> "You may not send automated queries of any sort to Google's system
> without express permission in advance from Google."
I'm not just being a pedantic weasel here, but what's an automated query?

Google's ToS is a legal document (maybe), and if both parties don't agree on the meanings of terms, well, then it is a lousy legal document and a recipe for trouble. Google don't define "automated query", and I don't think they can. In fact, the closest they come to defining it is to list three things they want to prevent, NONE of which have anything to do with the distinction between automated and non-automated. (What on earth is "meta-searching"? If you're going to use terms which don't have a commonly understood meaning, define what they mean.)

If I want to search for "foo", and I type "foo" into the Firefox search box, is that an automated query? What if I type "gg: foo" into Konqueror's address bar, which expands to "http://www.google.com/search?q=foo"? Is it okay if I type the URL by hand myself?

Can I use the browser to save the search page to a local HTML file? If Google says no, how can they possibly hope to stop me?

What if I type this command into my shell?

    elinks --dump "http://www.google.com/search?q=foo" > output.html

What if I type

    wget "http://www.google.com/search?q=foo"

into the shell? Surely that's no more automated than typing "foo" into Google's search box.

(wget doesn't in fact work, as Google recognises its user-agent string and blocks it, EVEN in cases where I am using wget manually. What, can't Google themselves tell the difference between automatic and non-automatic searching?)

Where is the line I must not cross?

The thing is, Google doesn't want people "reselling" their services, and I respect Google's intention. But drawing a distinction between "automated" and "non-automated" requests is difficult if not impossible, as can be seen by the heavy-handed way Google blocks the manual use of wget. I don't condone the gross abuse of Google's service, but I don't think an artificial distinction between automated and non-automated is a useful way to prevent it.
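To make the Konqueror example above concrete: a shortcut like "gg: foo" is nothing more than string substitution. Here is a minimal sketch in Python of what such an expansion might look like. The SHORTCUTS table and the expand_shortcut() name are made up for illustration, and this is not how Konqueror actually implements it; only the "gg" prefix and the resulting URL come from the example above.

```python
from urllib.parse import quote_plus

# Hypothetical shortcut table; only the "gg" entry is taken from the post.
SHORTCUTS = {
    "gg": "http://www.google.com/search?q={query}",
}


def expand_shortcut(text):
    """Expand 'prefix: terms' into a URL; return None if no shortcut matches."""
    prefix, sep, terms = text.partition(":")
    template = SHORTCUTS.get(prefix.strip())
    if not sep or template is None:
        return None
    # quote_plus() escapes the search terms the way an HTML form would.
    return template.format(query=quote_plus(terms.strip()))


if __name__ == "__main__":
    print(expand_shortcut("gg: foo"))
```

Typing the shortcut and typing the full URL by hand end up as the same request, which is why it is hard to say where "automated" begins.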
Of course, what I think isn't important. If Google wants to write legal contracts that won't stand up in court (speaking as somebody who isn't a lawyer and whose legal advice is worthless), they can.

But the point is, I see no ethical or legal reason why a user can't create a script which is called MANUALLY by the user and does what a browser does, namely send and receive data from websites (which may or may not include Google). And that, it seems to me, is what the Original Poster wanted.

-- 
Steven D'Aprano