Hi, I have a small Python script to fetch some pages from the internet. There are a lot of pages and I am looping through them and then downloading the page using urlretrieve() in the urllib module.
The problem is that after 110 pages or so the script sort of hangs and then I get the following traceback:

>>>>
Traceback (most recent call last):
  File "volume_archiver.py", line 21, in <module>
    urllib.urlretrieve(remotefile,localfile)
  File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/urllib.py", line 89, in urlretrieve
    return _urlopener.retrieve(url, filename, reporthook, data)
  File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/urllib.py", line 222, in retrieve
    fp = self.open(url, data)
  File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/urllib.py", line 190, in open
    return getattr(self, name)(url)
  File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/urllib.py", line 328, in open_http
    errcode, errmsg, headers = h.getreply()
  File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/httplib.py", line 1195, in getreply
    response = self._conn.getresponse()
  File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/httplib.py", line 924, in getresponse
    response.begin()
  File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/httplib.py", line 385, in begin
    version, status, reason = self._read_status()
  File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/httplib.py", line 343, in _read_status
    line = self.fp.readline()
  File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/socket.py", line 331, in readline
    data = recv(1)
IOError: [Errno socket error] (54, 'Connection reset by peer')
>>>>>>

My script code is as follows:

-----------------------------------------
import os
import urllib

volume_number = 149 # The volumes number 150 to 544
while volume_number < 544:
    volume_number = volume_number + 1
    localfile = '/Users/Chris/Desktop/Decisions/' + str(volume_number) + '.html'
    remotefile = 'http://caselaw.lp.findlaw.com/scripts/getcase.pl?court=us&navby=vol&vol=' + str(volume_number)
    print 'Getting volume number:', volume_number
    urllib.urlretrieve(remotefile,localfile)

print 'Download complete.'
-----------------------------------------

Once I get the error, running the script again doesn't do much good. It usually gets two or three pages and then hangs again. What is causing this?
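
For reference, here is a minimal sketch of one workaround that might be worth trying. It assumes the reset comes from the server dropping connections when requests arrive too quickly, which is only a guess and not something the traceback confirms. The helper name fetch_with_retry and its parameters (attempts, delay, sleep) are hypothetical and not part of urllib. It wraps any download function, catches the IOError shown in the traceback, and pauses a little longer before each retry.

```python
import time


def fetch_with_retry(fetch, url, dest, attempts=3, delay=5.0, sleep=time.sleep):
    """Call fetch(url, dest); on IOError, pause and retry up to `attempts` times.

    `fetch` is any callable taking (url, dest), for example urllib.urlretrieve
    on Python 2 or urllib.request.urlretrieve on Python 3. All names here are
    illustrative assumptions, not an existing library API.
    """
    for attempt in range(1, attempts + 1):
        try:
            return fetch(url, dest)
        except IOError:
            # Give up and re-raise once the last allowed attempt has failed.
            if attempt == attempts:
                raise
            # Wait longer after each failure: delay, 2 * delay, 3 * delay, ...
            sleep(delay * attempt)
```

Inside the loop, the call urllib.urlretrieve(remotefile,localfile) would become fetch_with_retry(urllib.urlretrieve, remotefile, localfile). If the server really is throttling, adding a short time.sleep() between volumes may matter more than the retries themselves.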