New submission from Roy Liu <carso...@gmail.com>:

While testing urllib.request.urlopen in Python 3, I found that it returns empty 
response bodies for some sites: reading from the returned file-like object 
yields zero bytes. Python 2.x's urllib2.urlopen does not behave this way. I 
narrowed the problem down to the following difference:

@@ -1137,8 +1137,6 @@
             r = h.getresponse()  # an HTTPResponse instance
         except socket.error as err:
             raise URLError(err)
-        finally:
-            h.close()
 
         r.url = req.get_full_url()
         # This line replaces the .msg attribute of the HTTPResponse

The "finally" clause is absent from urllib2.py but present in Python 3.2's 
request.py. I suspect the HTTPConnection is being closed before the response 
data can be read. Still, it's puzzling, because other sites return the 
expected bodies. Please find attached a small test script for "www.wsj.com", 
for which the response body is empty unless the above patch is applied.
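
For anyone who wants to check the behavior without depending on an external 
site, here is a minimal sketch of a local check. It is not the attached 
test.py. It assumes that a response with no Content-Length header, whose body 
ends when the server closes the socket, is a fair stand-in for the affected 
sites; I have not confirmed that this is what they send. The names BODY, 
CloseDelimitedHandler and fetch_body are made up for illustration.

```python
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical payload; any non-empty byte string would do.
BODY = b"hello from the local test server\n"


class CloseDelimitedHandler(BaseHTTPRequestHandler):
    """Send a body with no Content-Length (assumed trigger, not confirmed)."""

    def do_GET(self):
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Connection", "close")
        self.end_headers()
        # The handler defaults to HTTP/1.0, so the socket is closed after
        # this write and the client must read until end of stream.
        self.wfile.write(BODY)

    def log_message(self, format, *args):
        pass  # keep the request log off stderr


def fetch_body():
    # Port 0 lets the operating system pick a free port.
    server = HTTPServer(("127.0.0.1", 0), CloseDelimitedHandler)
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    try:
        url = "http://127.0.0.1:%d/" % server.server_address[1]
        # An empty ProxyHandler keeps proxy environment variables from
        # interfering; the request still goes through HTTPHandler.do_open,
        # the code shown in the diff above.
        opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
        response = opener.open(url, timeout=5)
        try:
            return response.read()
        finally:
            response.close()
    finally:
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    body = fetch_body()
    print(len(body), "bytes read")
```

If the connection really is torn down before the data is read, I would expect 
fetch_body() to return b"" on an affected interpreter and the full BODY 
otherwise.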

----------
components: Extension Modules
files: test.py
messages: 141039
nosy: royliu
priority: normal
severity: normal
status: open
title: urllib.request.urlopen gives empty response bodies for some sites
type: behavior
versions: Python 3.1, Python 3.2
Added file: http://bugs.python.org/file22739/test.py

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue12628>
_______________________________________