jdonnell wrote:
I've been writing a simple web spider for fun, and I've run into a
problem I can't figure out. The spider hangs (waits for username and
password) when I hit a page that requires .htaccess authentication.
self.f = urllib.urlopen('http://blogbloc.com/~jay/test/')
#nothing below here gets executed
print self.f.info()
...
It hangs as soon as I call urllib.urlopen(). I was going to try to read
the info and break for pages that require authentication, but it hangs
before I can call self.f.info().
Any ideas?
I tried Google. First I looked for "python urlopen authentication".
I scanned the top page for the word "authentication" and found a
few references, then something called FancyURLopener. Adding that
to my search, skipping down a couple of links, I quickly found
a page that starts "Here is an explanation about how to handle password
protected sites."
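
For reference, the hang comes from urllib.urlopen() going through
FancyURLopener, which prompts on the terminal for a username and
password when the server answers 401. Below is a minimal sketch of the
"break for pages that require authentication" idea. It assumes Python 3's
urllib.request rather than the Python 2 urllib shown above; there,
urlopen() does not prompt, and a 401 surfaces as urllib.error.HTTPError.
The needs_auth name and its opener parameter are hypothetical, chosen
only for illustration.

```python
import urllib.error
import urllib.request


def needs_auth(url, opener=urllib.request.urlopen):
    """Return True if fetching url fails with HTTP 401.

    opener is a hypothetical hook (not part of urllib) so the fetch
    function can be swapped out; it defaults to urllib.request.urlopen.
    """
    try:
        response = opener(url)
    except urllib.error.HTTPError as err:
        if err.code == 401:
            # The server wants credentials: report it instead of prompting.
            return True
        raise  # other HTTP errors are unrelated to authentication
    response.close()
    return False
```

A spider could call needs_auth(url) before fetching and simply skip any
URL for which it returns True.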
Another approach that often works is to throw in the word
"recipe", hoping perhaps to get a hit in the Python Cookbook
page: try "python http authentication recipe", for example.
I hope that teaches you a bit about how to fish, rather than
just giving you one. ;-)
-Peter