Bugs item #680577, was opened at 2003-02-05 00:22
Message generated for change (Settings changed) made by gbrandl
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=105470&aid=680577&group_id=5470

Please note that this message will contain a full copy of the comment thread, including the initial issue submission, for this request, not just the latest update.

Category: Python Library
Group: Python 2.3
>Status: Closed
>Resolution: Fixed
Priority: 5
Submitted By: GaryD (gazzadee)
Assigned to: Nobody/Anonymous (nobody)
Summary: urllib2 authentication problem

Initial Comment:
I've found a problem using the authentication in urllib2.

When matching up host-names in order to find a password, then putting the protocol in the address makes it seem like a different address. eg...

I create a HTTPBasicAuthHandler with a HTTPPasswordMgrWithDefaultRealm, and add the tuple (None, "http://proxy.blah.com:17828", "foo", "bar") to it.

I then setup the proxy to use http://proxy.blah.com:17828 (which requires authentication).

When I connect, the password lookup fails, because it is trying to find a match for "proxy.blah.com:17828" rather than "http://proxy.blah.com:17828"

This problem doesn't exist if I pass "proxy.blah.com:17828" to the password manager.

There seems to be some stuff in HTTPPasswordMgr to deal with variations on site names, but I guess it's not working in this case (unless this is intentional).

Version Info:
Python 2.2 (#1, Feb 24 2002, 16:21:58)
[GCC 2.96 20000731 (Mandrake Linux 8.2 2.96-0.76mdk)] on linux-i386

----------------------------------------------------------------------

>Comment By: Georg Brandl (gbrandl)
Date: 2006-05-03 05:33

Message:
Logged In: YES
user_id=849994

Closing accordingly.

----------------------------------------------------------------------

Comment By: John J Lee (jjlee)
Date: 2006-04-15 18:45

Message:
Logged In: YES
user_id=261020

This issue is fixed by patch 1470846.

----------------------------------------------------------------------

Comment By: John J Lee (jjlee)
Date: 2003-12-16 12:49

Message:
Logged In: YES
user_id=261020

Thanks!

It seems .reduce_uri() tries to cope with hostnames as well as absoluteURIs.
I don't understand why it wants to do that, but it fails, because it doesn't anticipate what urlparse does when a port is present:

>>> urlparse.urlparse("foo.bar.com")
('', '', 'foo.bar.com', '', '', '')
>>> urlparse.urlparse("foo.bar.com:80")
('foo.bar.com', '', '80', '', '', '')

I haven't checked, but I assume it's just incorrect use of urlparse to pass it a hostname. Of course, if it's "fixed" to only accept absoluteURIs, it will break existing code, so I guess it must be fixed for hostnames. :-((

Also, I think .is_suburi("/foo/spam", "/foo/eggs") should return False, but returns True, and .http_error_40x() use req.get_host() when they should be using req.get_full_url() (from a quick look at RFC 2617).

----------------------------------------------------------------------

Comment By: GaryD (gazzadee)
Date: 2003-12-16 03:10

Message:
Logged In: YES
user_id=693152

Okay, I have attached a file that replicates this problem.

If you run it as is (replacing the proxy name and address with something suitable), then it will fail (requiring proxy authentication).

If you uncomment line 23 (which specifies the password without the scheme), then it will work successfully.

Technical Info:
* For a proxy, I am using Squid Cache version 2.4.STABLE7 for i586-mandrake-linux-gnu...
* I have replicated the problem with Python 2.2.2 on Linux, and Python 2.3.2 on Windows XP.

----------------------------------------------------------------------

Comment By: GaryD (gazzadee)
Date: 2003-12-16 02:08

Message:
Logged In: YES
user_id=693152

This was a while ago, and my memory has faded. I'll try to respond intelligently.

I think the question was with the way the password manager looks up passwords, rather than anything else.

I am pretty sure that the problem is not to do with the URI passed to urlopen(). In the code shown below, the problem was solely dependent on whether I added the line:
(None, "blah.com:17828", "foo", "bar")
...to the HTTPPasswordMgrWithDefaultRealm object.

If that password set was added, then the password lookup for the proxy was successful, and urlopen() worked.

If that password set was not included, then the password lookup for the proxy was unsuccessful (despite the inclusion of the other 2, similar, password sets - "http://blah.com:17828" and "blah.com"), and urlopen() would fail. Hence my suspicion that the password manager did not fully remove the scheme, despite attempts to do so.

I'll see if I can set it up on the latest python and get it to happen again.

Just as an explanation, the situation was that I was running an authenticating proxy on a non-standard port (in order to avoid clashing with the normal proxy), in order to test out how my download code would work through an authenticating proxy.

----------------------------------------------------------------------

Comment By: John J Lee (jjlee)
Date: 2003-12-01 00:14

Message:
Logged In: YES
user_id=261020

The problem seems to be with the port (:17828), not the URL scheme (http:), because HTTPPasswordMgr.reduce_uri() removes the scheme.

RFC 2617 (top of page 3) says nothing about removing the port from the URI. urllib2 does not remove the port, so this doesn't appear to be a bug.

I guess gazzadee was doing a urlopen with a different canonical root URI (RFC 2617, top of page 3 again) to the one he gave in add_password (ie. the URL he passed to urlopen() had no explicit port number).

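The hostname-versus-URI mismatch discussed in this thread can be illustrated with a minimal sketch. It assumes Python 3's urllib.parse rather than the Python 2 urlparse module quoted above, and the helper names normalize_auth_uri and is_suburi are hypothetical illustrations, not the urllib2 implementation and not the code of patch 1470846. The idea is to reduce both an absolute URI and a bare "host:port" string to the same (authority, path) pair before comparing them, and to compare paths segment by segment.

```python
from urllib.parse import urlsplit


def normalize_auth_uri(uri):
    """Reduce an absolute URI or a bare host[:port] to (authority, path).

    Hypothetical helper for illustration only; not urllib2's reduce_uri().
    """
    # Prefixing "//" makes urlsplit() treat a bare "host:port" as an
    # authority instead of guessing that "host" is a URL scheme.
    parts = urlsplit(uri if "://" in uri else "//" + uri)
    return parts.netloc, parts.path or "/"


def is_suburi(base, test):
    """Return True if `test` is at or below `base`.

    Both arguments are (authority, path) tuples as returned by
    normalize_auth_uri(). Paths are compared on whole segments, so
    "/foo/eggs" is not treated as being below "/foo/spam".
    """
    if base[0] != test[0]:
        return False
    base_path = base[1] if base[1].endswith("/") else base[1] + "/"
    test_path = test[1] if test[1].endswith("/") else test[1] + "/"
    return test_path.startswith(base_path)


# Both spellings from the original report reduce to the same key.
with_scheme = normalize_auth_uri("http://proxy.blah.com:17828")
without_scheme = normalize_auth_uri("proxy.blah.com:17828")
print(with_scheme == without_scheme)
```

Under these assumptions, a password stored under either spelling would be found by a lookup that uses the other one, while the port stays part of the authority, which matches the observation above that RFC 2617 does not call for removing it.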
----------------------------------------------------------------------

Comment By: GaryD (gazzadee)
Date: 2003-02-09 23:17

Message:
Logged In: YES
user_id=693152

Okay, the same problem crops up in Python 2.2.2 running under cygwin on Win XP

Version Info:
Python 2.2.2 (#1, Dec 31 2002, 12:24:34)
[GCC 3.2 20020927 (prerelease)] on cygwin

Here's the pertinent section of my test file (passwords and URL changed to protect the innocent):

# Setup proxy
proxy_handler = ProxyHandler({"http" : "http://blah.com:17828"})

# Setup authentication
pass_mgr = HTTPPasswordMgrWithDefaultRealm()
for passwd in [ \
    (None, "http://blah.com:17828", "foo", "bar"), \
    # (None, "blah.com:17828", "foo", "bar"), \ # Works if this line is uncommented
    (None, "blah.com", "foo", "bar"), \
    ]:
    print("Adding password set (%s, %s, %s, %s)" % passwd)
    pass_mgr.add_password(*passwd)

auth_handler = HTTPBasicAuthHandler(pass_mgr)
proxy_auth_handler = ProxyBasicAuthHandler(pass_mgr)

# Now build a new URL opener and install it
opener = build_opener(proxy_handler, proxy_auth_handler, auth_handler, HTTPHandler)
install_opener(opener)

# Now try to open a file and see what happens
request = Request("http://www.google.com")
try:
    remotefile = urlopen(request)
except HTTPError, ex:
    print("Unable to download file due to HTTP Error %d (%s)." % (ex.code, ex.msg))
    return

----------------------------------------------------------------------

Comment By: Gerhard Häring (ghaering)
Date: 2003-02-07 23:21

Message:
Logged In: YES
user_id=163326

Can you please retry with Python 2.2.2? It seems that a related bug was fixed for 2.2.2:

http://python.org/2.2.2/NEWS.txt has an entry:

"""
- In urllib2.py: fix proxy config with user+pass authentication.
[SF patch 527518]
"""