In article <mailman.1917.1271357827.23598.python-l...@python.org>,
J. Cliff Dyer <j...@sdf.lonestar.org> wrote:
>On Thu, 2010-04-15 at 11:25 -0700, koranthala wrote:
>>
>> Suppose I am doing the following:
>> req = urllib2.urlopen('http://www.python.org')
>> data = req.read()
>>
>> When is the actual data received? is it done by the first line? or
>> is it done only when req.read() is used?
>> My understanding is that when urlopen is done itself, we would have
>> received all the data, and req.read() just reads it from the file
>> descriptor.
>> But, when I read the source code of pylot, it mentioned the
>> following:
>> resp = opener.open(request) # this sends the HTTP request
>> and returns as soon as it is done connecting and sending
>> connect_end_time = self.default_timer()
>> content = resp.read()
>> req_end_time = self.default_timer()
>>
>> Here, it seems to suggest that the data is received only after you do
>> resp.read(), which made me all confused.
>
>My understanding (please correct me if I'm wrong), is that when you call
>open, you send a request to the server, and get a response object back.
>The server immediately begins sending data (you can't control when they
>send it, once you've requested it). When you call read() on your
>response object, it reads all the data it has already received, and if
>that amount of data isn't sufficient to handle your read call, it blocks
>until it has enough.
>
>So your opener returns as soon as the request is sent, and read() blocks
>if it doesn't have enough data to handle your request.

Close. urlopen() returns once it has received the HTTP response
headers (that's why you can get an HTTP exception on e.g. a 404 without
calling read()).
-- 
Aahz (a...@pythoncraft.com)           <*>         http://www.pythoncraft.com/

"It is easier to optimize correct code than to correct optimized code."
--Bill Harlan
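
The header-then-body behaviour described above can be checked locally
without touching python.org. What follows is only a rough sketch, not
anything taken from pylot or from urllib2 itself: it uses Python 3's
urllib.request (the successor to urllib2), and the handler class, the
half-second BODY_DELAY and the measure() helper are all names made up
for this illustration. The toy server sends its headers, waits, and
only then sends the body, so the two timings show where open() returns
and where read() blocks.

```python
import threading
import time
import urllib.error
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

BODY = b"x" * 1000
BODY_DELAY = 0.5  # assumed pause (seconds) between headers and body


class SlowBodyHandler(BaseHTTPRequestHandler):
    """Hypothetical test server: headers first, body after a delay."""

    def do_GET(self):
        if self.path != "/":
            self.send_error(404)
            return
        self.send_response(200)
        self.send_header("Content-Length", str(len(BODY)))
        self.end_headers()      # status line and headers go out here
        self.wfile.flush()
        time.sleep(BODY_DELAY)  # the body is held back on purpose
        self.wfile.write(BODY)

    def log_message(self, *args):
        pass  # keep the demo quiet


def measure():
    server = HTTPServer(("127.0.0.1", 0), SlowBodyHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    base = "http://127.0.0.1:%d" % server.server_port
    # An empty ProxyHandler keeps any proxy settings out of the picture.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    try:
        start = time.monotonic()
        resp = opener.open(base + "/")   # returns once headers are in
        open_time = time.monotonic() - start
        status = resp.status
        data = resp.read()               # blocks until the body arrives
        total_time = time.monotonic() - start
        resp.close()

        try:
            opener.open(base + "/missing")
            error_code = None
        except urllib.error.HTTPError as exc:
            error_code = exc.code        # raised by open(), no read() needed
            exc.close()
    finally:
        server.shutdown()
        server.server_close()
    return open_time, total_time, status, len(data), error_code


if __name__ == "__main__":
    open_time, total_time, status, size, error_code = measure()
    print("open() returned after %.3fs with status %d" % (open_time, status))
    print("read() finished after %.3fs with %d bytes" % (total_time, size))
    print("missing page raised HTTPError with code", error_code)
```

On a real network the body usually starts arriving right behind the
headers, so the gap between the two timings is much smaller than in
this contrived setup, but the order of events is the same.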