Lukas Wunner added the comment:

Unfortunately this bug was only fixed in urllib2.py but never in urllib.py. 
This keeps biting people in the butt to this very day. Example: 
https://chromiumcodereview.appspot.com/10825107/

The attached patch remedies that and also fixes some more issues:

(1) proxy_bypass handling:
URLopener.open() will route the request to one of the open_*() methods based on 
the type of the *request* url. However, if a proxy is defined, it instead 
routes the request based on the type of the *proxy* url. So far so good. But: 
In open_http(), the code checks if proxy_bypass(realhost) is true and if so it 
modifies the Host header of the outgoing request. This code only works properly 
if the request url is by chance of type "http". If the request url type is e.g. 
"ftp" and the proxy url type is "http" and there's a proxy_bypass defined for 
realhost, things will go awry since the program will try to speak HTTP with 
realhost while it should really speak FTP. (In other words, open_ftp() should 
be used instead of open_http(); the program is stuck in the wrong code path.) 
Also, proxy_bypass handling is currently only implemented for the proxy url 
type "http" (and not, for instance, "https"). The patch solves this by moving 
the proxy_bypass check to URLopener.open(): If a proxy_bypass is defined for 
realhost, the request is routed based on the request url type and not based on 
the proxy url type.
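
To make the intended routing in (1) concrete, here is a minimal sketch. It is 
not the attached patch: pick_handler_scheme() and bypass_hosts are hypothetical 
names invented for illustration, and a plain set of host names stands in for 
the real proxy_bypass() lookup.

```python
try:
    from urllib.parse import urlsplit  # Python 3
except ImportError:
    from urlparse import urlsplit      # Python 2


def pick_handler_scheme(request_url, proxies, bypass_hosts):
    """Return the url type that should select the open_*() method.

    Hypothetical helper: `proxies` maps a url type to a proxy url and
    `bypass_hosts` is a set of hosts that must not go through a proxy.
    """
    request = urlsplit(request_url)
    proxy_url = proxies.get(request.scheme)
    # No proxy configured, or the host is exempt from proxying:
    # dispatch on the type of the *request* url.
    if proxy_url is None or request.hostname in bypass_hosts:
        return request.scheme
    # Otherwise the request really goes to the proxy, so dispatch on
    # the type of the *proxy* url.
    return urlsplit(proxy_url).scheme
```

With proxies = {"ftp": "http://proxy.example:3128"}, an "ftp" request is 
dispatched as "http" when no bypass applies, and as "ftp" when its host is in 
bypass_hosts, which is the case that currently ends up in open_http().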

(2) addinfourl construction:
Upon successful retrieval of the URL, open_http() and open_https() will 
construct an addinfourl object and return that to the caller. The object is 
constructed with the hard-coded url type "http" / "https". So if, for instance, the 
request url type is "ftp" and the proxy url type is "http", the addinfourl 
object will contain a url whose type will have magically changed from "ftp" to 
"http".

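The effect described in (2) can be reproduced in isolation. The sketch below is 
an assumption-laden illustration, not code from urllib.py or from the patch: 
both function names are hypothetical, and the url is rebuilt with plain string 
operations instead of an addinfourl object.

```python
def url_with_hardcoded_type(request_url):
    """Rebuild the url with a fixed "http:" prefix (the problematic pattern)."""
    # Drop the original url type and keep only what follows the first colon.
    rest = request_url.split(":", 1)[1]
    return "http:" + rest


def url_with_request_type(request_url):
    """Rebuild the url from the type the caller actually requested."""
    urltype, rest = request_url.split(":", 1)
    return urltype + ":" + rest


requested = "ftp://ftp.example.org/pub/file"
print(url_with_hardcoded_type(requested))  # http://ftp.example.org/pub/file
print(url_with_request_type(requested))    # ftp://ftp.example.org/pub/file
```

The first function shows how an "ftp" url comes back labelled "http"; the 
second keeps the type of the request url.
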
(3) Superfluous code:
At the beginning of open_http() and open_https(), the program sets "user_passwd 
= None". Directly below is an if-else statement. At the beginning of the else 
block the program again sets "user_passwd = None".

The patch also works with Python 2.6 save for set_tunnel() in httplib.py, which 
was called _set_tunnel() in 2.6.

----------
nosy: +l
Added file: http://bugs.python.org/file31201/issue1424152-py27-urllib.diff

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue1424152>
_______________________________________