cab...@gmail.com wrote:

> Hi,
>
> I have been using Java/Perl professionally for many years and have been
> trying to learn python3 recently. As my first program, I tried writing a
> class for a small project, and I am having a really hard time understanding
> exception handling in urllib and in python in general... Basically, what I
> want to do is very simple, try to fetch something
> "try urllib.request.urlopen(request)", and:
> - If the request times out or the connection is reset, retry n times
> - If it fails, return an error
> - If it works, return the content.
>
> But this simple requirement became a nightmare for me. I am really
> confused about how I should be checking this because:
> - When the connection times out, I sometimes get URLError with the "reason"
>   field set to socket.timeout, and checking (isinstance(exception.reason,
>   socket.timeout)) works fine
> - But sometimes I get the socket.timeout exception directly, and it has no
>   "reason" field, so the above check fails.
> - Connection reset is a totally different exception
> - Not to mention, some exceptions have msg / reason / errno fields but
>   some don't, so there is no way of knowing exception details unless you
>   check them one by one. The only common thing I could find was to call
>   __str__()?
> - Since there are too many possible exceptions, you need to catch
>   BaseException (I received URLError, socket.timeout,
>   ConnectionRefusedError, ConnectionResetError, BadStatusLine, and none
>   share a common parent). And catching the top level exception is not a
>   good thing.
>
> So, I ended up writing the following, but from everything I know, this
> looks really ugly and wrong???
>
>     try:
>         response = urllib.request.urlopen(request)
>         content = response.read()
>     except BaseException as ue:
>         if (isinstance(ue, socket.timeout) or (hasattr(ue, "reason")
>                 and isinstance(ue.reason, socket.timeout)) or
>                 isinstance(ue, ConnectionResetError)):
>             print("REQUEST TIMED OUT")
>
> or, something like:
>
>     except:
>         (a1,a2,a3) = sys.exc_info()
>         errorString = a2.__str__()
>         if ((errorString.find("Connection reset by peer") >= 0) or
>                 (errorString.find("error timed out") >= 0)):
>
> Am I missing something here? I mean, is this really how I should be doing
> it?

Does it help if you reorganize your code a bit? For example:

    import socket
    import urllib.request
    from urllib.error import URLError

    def read_content(request):
        try:
            response = urllib.request.urlopen(request)
            content = response.read()
        except socket.timeout:
            # timeout raised directly by the socket layer
            return None
        except URLError as ue:
            # timeout wrapped by urllib in URLError.reason
            if isinstance(ue.reason, socket.timeout):
                return None
            raise  # anything else is not retryable here
        return content

    for i in range(max_tries):
        content = read_content(request)
        if content is not None:
            break
    else:
        # the loop ran out of tries without hitting break
        print("Could not download", request)

Instead of returning an out-of-band response (None) you could also raise a
custom exception (called MyTimeoutError below). The retry loop would then
become

    for i in range(max_tries):
        try:
            content = read_content(request)
        except MyTimeoutError:
            pass
        else:
            break
    else:
        print("Could not download", request)
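
To make the custom-exception variant concrete, here is a minimal sketch of how the pieces might fit together. It is only one possible arrangement: the body of MyTimeoutError, the fetch_with_retries() helper and the opener parameter are assumptions made for illustration (opener exists only so the code can be tried without network access), and ConnectionResetError is treated as retryable because the original question asks for that.

```python
import socket
import urllib.request
from urllib.error import URLError


class MyTimeoutError(Exception):
    """Hypothetical exception: the request timed out or the connection was reset."""


def read_content(request, opener=urllib.request.urlopen):
    # `opener` is an illustrative hook, not part of urllib; it defaults
    # to the real urllib.request.urlopen.
    try:
        response = opener(request)
        return response.read()
    except (socket.timeout, ConnectionResetError) as e:
        # low-level error raised directly
        raise MyTimeoutError(str(e)) from e
    except URLError as ue:
        # the same low-level error, wrapped by urllib in URLError.reason
        if isinstance(ue.reason, (socket.timeout, ConnectionResetError)):
            raise MyTimeoutError(str(ue.reason)) from ue
        raise  # any other URLError is passed on unchanged


def fetch_with_retries(request, max_tries=3, opener=urllib.request.urlopen):
    for _ in range(max_tries):
        try:
            return read_content(request, opener)
        except MyTimeoutError:
            continue  # retryable failure, try again
    return None  # every attempt timed out or was reset
```

With this layout the isinstance() checks live in one place, and the caller only has to distinguish three outcomes: content returned, None after max_tries retryable failures, or some other exception propagating.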