Stefano Mazzucco added the comment:

Martin, thanks for elaborating on my thoughts!
I have dug a bit deeper into Python2's urllib code with pdb, and I think I have narrowed the issue down to what open_http does. In my example code, replacing opener.open(url) with opener.open_http(url) gives the same problem.

I realize I did not provide you with the output of the script, so here it is:

* Python 2.7.10

    python urllib_error.py
    ('Trying to open', 'https://www.python.org')
    Traceback (most recent call last):
      File "urllib_error.py", line 30, in <module>
        opener.open_http((host, selector))
      File "/home/mazzucco/.pyenv/versions/2.7.10/lib/python2.7/urllib.py", line 364, in open_http
        return self.http_error(url, fp, errcode, errmsg, headers)
      File "/home/mazzucco/.pyenv/versions/2.7.10/lib/python2.7/urllib.py", line 381, in http_error
        return self.http_error_default(url, fp, errcode, errmsg, headers)
      File "/home/mazzucco/.pyenv/versions/2.7.10/lib/python2.7/urllib.py", line 386, in http_error_default
        raise IOError, ('http error', errcode, errmsg, headers)
    IOError: ('http error', 501, 'Not Implemented', <httplib.HTTPMessage instance at 0x7f875a67b950>)

* Python 3.4.3

    python urllib_error.py
    Trying to open https://www.python.org
    Traceback (most recent call last):
      File "urllib_error.py", line 30, in <module>
        opener.open_http((host, selector))
      File "/home/mazzucco/.pyenv/versions/3.4.3/lib/python3.4/urllib/request.py", line 1805, in open_http
        return self._open_generic_http(http.client.HTTPConnection, url, data)
      File "/home/mazzucco/.pyenv/versions/3.4.3/lib/python3.4/urllib/request.py", line 1801, in _open_generic_http
        response.status, response.reason, response.msg, data)
      File "/home/mazzucco/.pyenv/versions/3.4.3/lib/python3.4/urllib/request.py", line 1821, in http_error
        return self.http_error_default(url, fp, errcode, errmsg, headers)
      File "/home/mazzucco/.pyenv/versions/3.4.3/lib/python3.4/urllib/request.py", line 1826, in http_error_default
        raise HTTPError(url, errcode, errmsg, headers, None)
    urllib.error.HTTPError: HTTP Error 501: Not Implemented

When I unwrap the contents of httplib.HTTPMessage, the error page returned by the squid proxy says:

-------------------------------------------------------
ERROR
The requested URL could not be retrieved

The following error was encountered while trying to retrieve the URL: https://www.python.org

    Unsupported Request Method and Protocol

Squid does not support all request methods for all access protocols. For example, you can not POST a Gopher request.
-------------------------------------------------------

Looking at Python2's implementation of URLopener's open_http, I can get an even more minimal failing example limited to httplib:

    import httplib

    host = 'proxy.corp.com:8181'  # this is not the actual proxy
    selector = 'https://www.python.org'
    print("Trying to open", selector)
    h = httplib.HTTP(host)
    h.putrequest('GET', selector)
    h.putheader('User-Agent', 'Python-urllib/1.17')
    h.endheaders(None)
    errcode, errmsg, headers = h.getreply()
    print(errcode, errmsg)
    print(headers.items())

Running the script on Python 2.7.10 prints:

    ('Trying to open', 'https://www.python.org')
    (501, 'Not Implemented')
    [('content-length', '3069'),
     ('via', '1.0 proxy.corp.com (squid/3.1.6)'),
     ('x-cache', 'MISS from proxy.corp.com'),
     ('content-language', 'en'),
     ('x-squid-error', 'ERR_UNSUP_REQ 0'),
     ('x-cache-lookup', 'NONE from proxy.corp.com:8181'),
     ('vary', 'Accept-Language'),
     ('server', 'squid/3.1.6'),
     ('proxy-connection', 'close'),
     ('date', 'Fri, 10 Jul 2015 09:27:14 GMT'),
     ('content-type', 'text/html'),
     ('mime-version', '1.0')]

As I said, I found out about this when using buildout to download files over HTTPS. Buildout uses urllib.urlretrieve on Python2 and urllib.request.urlretrieve on Python3. I guess that the latter has been fixed in issue 1424152, which would explain why I can download with buildout on Python3.
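For comparison, here is a minimal sketch of how the same HTTPS URL might be requested through a CONNECT tunnel instead of a plain GET to the proxy. The 501 with ERR_UNSUP_REQ is consistent with the proxy receiving "GET https://..." directly, though I have not confirmed that this is the whole story. The sketch uses Python 3's http.client (Python 2.7's httplib has an equivalent set_tunnel method). The function names tunnel_through_proxy and fetch_status are made up for illustration, and it assumes the proxy accepts CONNECT to port 443 without authentication.

```python
import http.client
from urllib.parse import urlsplit


def tunnel_through_proxy(proxy_host, proxy_port, url):
    """Prepare an HTTPS connection that reaches `url` via a CONNECT tunnel.

    Hypothetical helper for illustration. Nothing is sent over the network
    until request() is called on the returned connection.
    """
    parts = urlsplit(url)
    if parts.scheme != "https":
        raise ValueError("expected an https URL, got %r" % (url,))

    # The socket is opened to the proxy; set_tunnel() records the real
    # destination so that CONNECT is issued before the TLS handshake.
    conn = http.client.HTTPSConnection(proxy_host, proxy_port, timeout=10)
    conn.set_tunnel(parts.hostname, parts.port or 443)

    # Inside the tunnel the request line carries only the path and query,
    # not the absolute URL that the failing example sends to the proxy.
    selector = parts.path or "/"
    if parts.query:
        selector += "?" + parts.query
    return conn, selector


def fetch_status(proxy_host, proxy_port, url):
    """Issue a GET through the tunnel and return (status, reason)."""
    conn, selector = tunnel_through_proxy(proxy_host, proxy_port, url)
    try:
        conn.request("GET", selector, headers={"User-Agent": "Python-urllib/1.17"})
        response = conn.getresponse()
        return response.status, response.reason
    finally:
        conn.close()
```

Calling fetch_status('proxy.corp.com', 8181, 'https://www.python.org') with the real proxy host in place of the placeholder would show whether the proxy answers differently once the request goes through a tunnel.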
----------
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue24599>