jdonnell wrote:
I've been writing a simple web spider for fun, and I've run into a
problem I can't figure out. The spider hangs (waits for username and
pass) when I hit a page that requires .htaccess authentication.

self.f = urllib.urlopen('http://blogbloc.com/~jay/test/')
#nothing below here gets executed
print self.f.info()
...

It hangs as soon as I call urllib.urlopen(). I was going to try to read
the info and break for pages that require authentication, but it hangs
before I can call self.f.info()

Any ideas?

I tried Google. First I searched for "python urlopen authentication".
I scanned the first page of results for the word "authentication" and
found a few references, then something called FancyURLopener. Adding
that to my search and skipping down a couple of links, I quickly found
a page that starts "Here is an explanation about how to handle
password protected sites."

Another approach that often works is to throw in the word
"recipe", hoping perhaps to get a hit in the Python Cookbook
page: try "python http authentication recipe", for example.
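
For the record, here is a minimal sketch of the "detect and skip" idea
from the original question. It assumes Python 3's urllib.request
rather than the old urllib.urlopen() used above: the old call goes
through FancyURLopener, which prompts for a username and password on
the terminal, whereas urllib.request.urlopen() raises an HTTPError with
code 401 when no credentials are configured. The name fetch_or_skip
and its opener parameter are made up for this illustration.

```python
import urllib.error
import urllib.request


def fetch_or_skip(url, opener=urllib.request.urlopen, timeout=10):
    """Return the response for url, or None if the page wants HTTP auth.

    `opener` is a hypothetical hook (not part of urllib) so that a spider
    or a test can substitute its own fetch function.
    """
    try:
        return opener(url, timeout=timeout)
    except urllib.error.HTTPError as err:
        if err.code == 401:
            # Password-protected page: report the challenge and move on
            # instead of waiting for someone to type credentials.
            headers = err.headers
            challenge = "unknown"
            if headers is not None:
                challenge = headers.get("WWW-Authenticate", "unknown")
            print("skipping %s (authentication required: %s)" % (url, challenge))
            return None
        # Any other HTTP error (404, 500, ...) is left to the caller.
        raise
```

In a spider this would be called as fetch_or_skip(url) for each link,
treating a None result as "protected, carry on with the next page".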

I hope that teaches you a bit about how to fish, rather than
just giving you one. ;-)

-Peter
--
http://mail.python.org/mailman/listinfo/python-list
