On Sun, 27 Feb 2011 22:19:18 -0800, Chris Rebert wrote:

> On Sun, Feb 27, 2011 at 9:38 PM, monkeys paw <mon...@joemoney.net> wrote:
>> I have a working urlopen routine which opens a url, parses it for <a>
>> tags and prints out the links in the page. On some sites, wikipedia for
>> instance, i get a
>>
>> HTTP error 403, forbidden.
>>
>> What is the difference in accessing the site through a web browser and
>> opening/reading the URL with python urllib2.urlopen?
> [...]
> Sidenote: Wikipedia has a proper API for programmatic browsing, likely
> hence why it's blocking your program.

What he said. Please don't abuse Wikipedia by screen-scraping it.

--
Steven
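
For anyone wondering what using the API instead of scraping might look like, here is a minimal sketch. It assumes Python 3 (urllib.request rather than the urllib2 mentioned above) and the MediaWiki "query" action with prop=links. The names build_links_request and fetch_links, and the USER_AGENT string, are made up for illustration. Wikimedia asks clients to send a descriptive User-Agent, so the sketch sets one; whether a missing or default User-Agent is what triggered the 403 in the original question is a guess on my part.

```python
import json
import urllib.parse
import urllib.request

API_URL = "https://en.wikipedia.org/w/api.php"

# Hypothetical identifier: replace with your own tool name and contact details.
USER_AGENT = "ExampleLinkLister/0.1 (contact: you@example.org)"


def build_links_request(title):
    """Build (but do not send) an API request for the links on one page."""
    params = {
        "action": "query",
        "prop": "links",
        "titles": title,
        "pllimit": "max",
        "format": "json",
    }
    url = API_URL + "?" + urllib.parse.urlencode(params)
    return urllib.request.Request(url, headers={"User-Agent": USER_AGENT})


def fetch_links(title):
    """Return the link titles from the first batch of results only."""
    with urllib.request.urlopen(build_links_request(title)) as resp:
        data = json.load(resp)
    links = []
    for page in data["query"]["pages"].values():
        # Pages with no links (or missing pages) have no "links" key.
        for link in page.get("links", []):
            links.append(link["title"])
    return links
```

Calling fetch_links("Python (programming language)") would hit the network and return the titles of pages linked from that article. Long pages are split over several responses, and this sketch does not follow the continuation data, so treat it as a starting point only.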