Wells Oliver wrote:
I'm writing a class which essentially spiders a site and saves the files
locally. On a URLError exception it sleeps for a second and tries again; on a
404 it just moves on. The relevant bit of code, including the offending
method:

class Handler(threading.Thread):
    def __init__(self, url):
        threading.Thread.__init__(self)
        self.url = url

    def save(self, uri, location):
        try:
            handler = urllib2.urlopen(uri)
        except urllib2.HTTPError, e:
            if e.code == 404:
                return
            else:
                print "retrying %s (HTTPError)" % uri
                time.sleep(1)
                self.save(uri, location)
        except urllib2.URLError, e:
            print "retrying %s" % uri
            time.sleep(1)
            self.save(uri, location)

        if not os.path.exists(os.path.dirname(location)):
            os.makedirs(os.path.dirname(location))

        file = open(location, "w")
        file.write(handler.read())
        file.close()

...

But what I am seeing is that after a retry (on catching a URLError
exception), I see bunches of "UnboundLocalError: local variable 'handler'
referenced before assignment" errors on line 38, which is the
"file.write(handler.read())" line.

What gives?

Your save() method is recursive in the error case, which is a poor excuse for what should have been a loop. You should also have some retry-depth check, just in case you get hundreds of such errors on the same site.

But ignoring that issue, the specific symptom shows up when the recursive call returns: you still fall through to the end of the outer call and try to write data from that handler (also truncating the file at "location" in the process). Since handler was never assigned in the outer call, that raises the UnboundLocalError. So you should follow those calls to self.save() with returns.
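
In other words, a minimal sketch of that quick fix (untested) -- the two retry branches become:

        except urllib2.HTTPError, e:
            if e.code == 404:
                return
            else:
                print "retrying %s (HTTPError)" % uri
                time.sleep(1)
                self.save(uri, location)
                return  # don't fall through to the write below with no handler
        except urllib2.URLError, e:
            print "retrying %s" % uri
            time.sleep(1)
            self.save(uri, location)
            return      # ditto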

Much better would be to write a loop in the first place, and break out of the loop when you succeed. Then the flow is much easier to follow, and you can easily avoid the risk of looping forever or of running out of stack space.

Something like (untested):

    def save(self, uri, location):
        for tries in xrange(10):        # try up to 10 times
            try:
                handler = urllib2.urlopen(uri)
            except urllib2.HTTPError, e:
                if e.code == 404:
                    break               # permanent failure: give up on this uri
                else:
                    print "retrying %s (HTTPError)" % uri
                    time.sleep(1)
                    continue
            except urllib2.URLError, e:
                print "retrying %s" % uri
                time.sleep(1)
                continue

            if not os.path.exists(os.path.dirname(location)):
                os.makedirs(os.path.dirname(location))

            f = open(location, "wb")    # "f", not "file" (which shadows the builtin);
                                        # binary mode, since the content may not be text
            f.write(handler.read())
            f.close()
            break                       # success -- stop retrying

Other refinements are possible, of course. For example, if you have any
cleanup code at the end, you may need an additional flag variable to tell
whether you've succeeded or not.
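
Here's a minimal, self-contained sketch of that idea (untested; fetch() and its retry count are just made-up placeholders). Instead of a separate flag variable, you can also use the for loop's else clause, which runs only when the loop finishes without hitting a break:

    import time
    import urllib2

    def fetch(uri, retries=10):
        data = None
        for tries in xrange(retries):
            try:
                data = urllib2.urlopen(uri).read()
                break                    # success -- stop retrying
            except urllib2.URLError, e:  # HTTPError is a subclass of URLError
                print "retrying %s" % uri
                time.sleep(1)
        else:
            # we never hit break, so every attempt failed:
            # put your cleanup/failure handling here
            print "giving up on %s" % uri
        return data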

