Alejandro Dubrovsky <[EMAIL PROTECTED]> writes:
[...]
> How does one connect through a proxy which requires basic authorisation?
> The following code, stolen from somewhere, fails with a 407:
>
[...code involving urllib2.ProxyBasicAuthHandler()...]
> Can anyone explain me why this fails, or more importantly, code that would
> work?

OK, I finally installed squid and had a look at the urllib2 proxy basic
auth support (which I've steered clear of for years despite doing quite a
bit with urllib2). It seems quite broken. It appears to have been broken
back in December 2004, with revision 38092 (note there's a little revision
number oddness in the Python SVN repo, BTW:
http://mail.python.org/pipermail/python-dev/2005-November/058269.html):

--- urllib2.py  (revision 38091)
+++ urllib2.py  (revision 38092)
@@ -720,7 +720,10 @@
             return self.retry_http_basic_auth(host, req, realm)

     def retry_http_basic_auth(self, host, req, realm):
-        user,pw = self.passwd.find_user_password(realm, host)
+        # TODO(jhylton): Remove the host argument? It depends on whether
+        # retry_http_basic_auth() is consider part of the public API.
+        # It probably is.
+        user, pw = self.passwd.find_user_password(realm, req.get_full_url())
         if pw is not None:
             raw = "%s:%s" % (user, pw)
...

That can't be right, can it? With a proxy, you're always authenticating
yourself for the whole proxy, and you want to look up the password by the
proxy's host, not by the URL being fetched (RFC 2617 section 3.2.1). The
ProxyBasicAuthHandler subclass dutifully passes in the right thing for the
host argument, but AbstractBasicAuthHandler ignores it, which means that it
never finds the password -- e.g. if you're trying to connect to python.org
through myproxy.com, it'll be looking for a username/password for
python.org instead of the needed myproxy.com.

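To make the mismatch concrete, here is a minimal sketch that exercises only the password manager, with no proxy or network involved. It assumes Python 3, where urllib2's HTTPPasswordMgr lives in urllib.request; the realm name, port and credentials are made up for illustration. Credentials registered under the proxy's host are found when the lookup uses that host, but not when it uses the full URL of the page being fetched, which is what the changed line does:

```python
import urllib.request

# Hypothetical values, chosen only for illustration.
PROXY_HOST = "myproxy.com:3128"
REALM = "proxy-realm"

mgr = urllib.request.HTTPPasswordMgr()
# Register the credentials under the proxy's host, as a user naturally would.
mgr.add_password(REALM, PROXY_HOST, "john", "blah")

# Lookup keyed by the proxy host (what ProxyBasicAuthHandler passes in as `host`).
by_proxy = mgr.find_user_password(REALM, PROXY_HOST)

# Lookup keyed by the target URL (what req.get_full_url() would supply).
by_target = mgr.find_user_password(REALM, "http://python.org/")

print(by_proxy)   # the stored user/password pair
print(by_target)  # nothing is found for the target host
```

With no match, the handler has no password to send, so the proxy's 407 goes unanswered.
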
Obviously nobody else uses authenticating proxies either, or at least
nobody who can be bothered to fix urllib2 :-(

A workaround is to supply a stupid HTTPPasswordMgr that always returns the
proxy credentials regardless of what the handler asks it for (only tested
with a perhaps-broken 2.5 install, since I've broken my 2.4 install):

import urllib2

class DumbProxyPasswordMgr:
    def __init__(self):
        self.user = self.passwd = None

    def add_password(self, realm, uri, user, passwd):
        # realm and uri are deliberately ignored
        self.user = user
        self.passwd = passwd

    def find_user_password(self, realm, authuri):
        # whatever the handler asks about, answer with the proxy credentials
        return self.user, self.passwd

proxy_auth_handler = urllib2.ProxyBasicAuthHandler(DumbProxyPasswordMgr())
proxy_handler = urllib2.ProxyHandler({"http": "http://localhost:3128"})
proxy_auth_handler.add_password(None, None, 'john', 'blah')

opener = urllib2.build_opener(proxy_handler, proxy_auth_handler)
f = opener.open('http://python.org/')
print f.read()

Yuck, yuck, yuck! I had realised the auth/proxies code in urllib2 was
buggy, but... And all those hoops to jump through.

Also, if you're using 2.5 SVN HEAD, it seems revision 42133 broke
ProxyHandler in an attempt to fix the URL host:port syntax!

I'll try to get some fixes in tomorrow so that 2.5 isn't broken (or at
least flag the issues to let somebody else fix them), but no promises as
usual...


John