O.R.Senthil Kumaran wrote:
> Thank you for the reply, Mr. John and I apologize for a very late response
> from my end.
>
> * John J. Lee <[EMAIL PROTECTED]> [2007-07-06 18:53:09]:
>
>>"O.R.Senthil Kumaran" <[EMAIL PROTECTED]> writes:
>>
>>>Hi,
>>>There is an Open Tracker item against urllib2 library python.org/sf/735515
>>>I am not completely getting what "cache - redirection" implies and what
>>>should be done with the urllib2 module. Any pointers?
>>
>>When a 301 redirect occurs after a request for URL U, via
>>urllib2.urlopen(U), urllib2 should remember the result of that
>>redirection, viz a second URL, V.  Then, when another
>>urllib2.urlopen(U) takes place, urllib2 should send an HTTP request
>>for V, not U.  urllib2 does not currently do this.  (Obviously the
>>cache -- that is, the dictionary or whatever that stores the mapping
>>from URLs U to V -- should not be maintained by function urlopen
>>itself.  Perhaps it should live on the redirect handler.)
>>
>
> I spent a little time thinking about a solution and figured out that the
> following changes to HTTPRedirectHandler, might be helpful in implementing
> this.
>
> Class HTTPRedirectHandler(BaseHandler):
>     # ... omitted ...
>     # Initialize a dictionary to hold cache.
>
>     def __init__(self):
>         self.cache = {}
>
>     # Handles 301 errors separately in a different function which maintains a
>     # maintains cache.
>
>     def http_error_301(self, req, fp, code, msg, headers):
>
>         if req in self.cache:
>             # Look for loop, if a particular url appears in both key and value
>             # then there is loop and return HTTPError
>             if len(set(self.cache.keys()) & set(self.cache.values())) > 0:
>                 raise HTTPError(req.get_full_url(), code, self.inf_msg + msg +
>                                 headers, fp)
>             return self.cache[req]
>
>         self.cache[req] = self.http_error_302(req,fp,code,msg, headers)
>         return self.cache[req]
>
>
> John, let me know your comments on this approach.
> I have not tested this code in real scenario yet with a 301 redirect.
> If it's okay, I shall test it and submit a patch for the tracker item.
That assumes you're reusing the same object to reopen another URL.
Is this thread-safe?

That's also an inefficient way to test for an empty dictionary.

				John Nagle
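
For what it's worth, here is a rough sketch of how those three points might be handled. It is not urllib2 code and not a proposed patch: RedirectCache, remember() and resolve() are hypothetical names made up for illustration. The cache is keyed on the URL string rather than the request object, so a later request for the same URL can hit it; a lock guards the dictionary; and the loop check only walks the chain starting at the requested URL instead of building sets from every key and value.

```python
import threading


class RedirectCache:
    """Hypothetical helper (not part of urllib2): remembers 301 targets by URL."""

    def __init__(self):
        self._lock = threading.Lock()   # guards _targets across threads
        self._targets = {}              # maps URL string U -> URL string V

    def remember(self, url, target):
        """Record that a request for url was permanently redirected to target."""
        with self._lock:
            self._targets[url] = target

    def resolve(self, url):
        """Follow cached redirects from url and return the final URL.

        Returns url unchanged if nothing is cached for it.
        Raises ValueError if the cached chain loops back on itself.
        """
        with self._lock:
            seen = {url}
            while url in self._targets:
                url = self._targets[url]
                if url in seen:
                    raise ValueError("redirect loop detected at %s" % url)
                seen.add(url)
            return url
```

A redirect handler could hold one shared instance, call remember() from its 301 hook, and call resolve() before sending a request. Whether the cache belongs on the handler or on the opener is still an open question in this thread.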