New submission from Roy Liu <carso...@gmail.com>:

When testing urllib.request.urlopen in Python 3, I found that it returned empty response bodies for some sites: reading from the file-like object yields zero bytes. Python 2.x's urllib2.urlopen did not behave this way. I narrowed the problem down to the following difference:

@@ -1137,8 +1137,6 @@
             r = h.getresponse()  # an HTTPResponse instance
         except socket.error as err:
             raise URLError(err)
-        finally:
-            h.close()

         r.url = req.get_full_url()
         # This line replaces the .msg attribute of the HTTPResponse

The "finally" clause is absent from Python 2.x's urllib2.py but present in Python 3.2's request.py. I think it has something to do with the HTTPConnection being closed before the data can be read. Still, it's puzzling, because some sites give the expected responses anyway.

Please find attached a small test script for "www.wsj.com"; without the above patch applied, the response body should be empty.

----------
components: Extension Modules
files: test.py
messages: 141039
nosy: royliu
priority: normal
severity: normal
status: open
title: urllib.request.urlopen gives empty response bodies for some sites
type: behavior
versions: Python 3.1, Python 3.2

Added file: http://bugs.python.org/file22739/test.py

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue12628>
_______________________________________
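
For anyone who wants to try the suspected mechanism without depending on www.wsj.com, here is a minimal sketch. It is not the attached test.py. It assumes a throwaway local server (the names BODY, Handler, fetch and run_demo are made up for illustration) that answers over HTTP/1.1 with a Content-Length header, so the connection stays alive after the headers arrive. On a recent CPython, closing the HTTPConnection before reading then appears to leave nothing to read; Python 3.1/3.2 internals may differ in the details, so treat this as an illustration rather than a diagnosis.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

BODY = b"hello from the test server"


class Handler(BaseHTTPRequestHandler):
    # Assumption for this sketch: HTTP/1.1 plus Content-Length, so the
    # server keeps the connection alive instead of closing it itself.
    protocol_version = "HTTP/1.1"

    def do_GET(self):
        self.send_response(200)
        self.send_header("Content-Length", str(len(BODY)))
        self.end_headers()
        self.wfile.write(BODY)

    def log_message(self, *args):
        pass  # keep the demo output quiet


def fetch(port, close_first):
    """Fetch "/" and return the body, optionally closing h before reading."""
    h = http.client.HTTPConnection("127.0.0.1", port, timeout=5)
    h.request("GET", "/")
    r = h.getresponse()  # an HTTPResponse instance
    if close_first:
        h.close()        # mirrors the "finally: h.close()" ordering
        return r.read()
    try:
        return r.read()  # read the body while the connection is open
    finally:
        h.close()


def run_demo():
    """Return (body read before close, body read after close)."""
    server = ThreadingHTTPServer(("127.0.0.1", 0), Handler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        port = server.server_address[1]
        return fetch(port, close_first=False), fetch(port, close_first=True)
    finally:
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    read_then_close, close_then_read = run_demo()
    print("read, then close:", len(read_then_close), "bytes")
    print("close, then read:", len(close_then_read), "bytes")
```

The only difference between the two fetch calls is whether h.close() runs before or after r.read(), which is the same ordering change the patch above makes.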