Re: urllib2 - iteration over non-sequence

2007-06-09 Thread Gabriel Genellina
On Sun, 10 Jun 2007 02:54:47 -0300, Erik Max Francis <[EMAIL PROTECTED]> wrote:
> Gary Herron wrote:
>
>> Certainly there are cases where xreadlines or read(bytecount) are
>> reasonable, but only if the total page size is *very* large. But for
>> most web pages, you guys are just nit-pick

Re: urllib2 - iteration over non-sequence

2007-06-09 Thread Erik Max Francis
Paul Rubin wrote:
> If you know in advance that the page you're retrieving will be
> reasonable in size, then using readlines is fine. If you don't know
> in advance what you're retrieving (e.g. you're working on a crawler)
> you have to assume that you'll hit some very large pages with
> difficu
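
The concern quoted above, not knowing a page's size in advance, is usually handled by reading with a byte count and stopping at a cap. A minimal sketch, assuming a file-like object with a read(n) method; the name read_capped, its parameters, and the default chunk size are hypothetical, and the code is written for current Python rather than the 2.2.1 interpreter discussed in this thread:

```python
import io

def read_capped(fileobj, max_bytes, chunk_size=8192):
    """Read at most max_bytes from fileobj, chunk_size bytes at a time.

    Returns (data, truncated); truncated is True when the source held
    more than max_bytes. read_capped is a hypothetical helper name.
    """
    chunks = []
    remaining = max_bytes
    while remaining > 0:
        # A network read may return fewer bytes than requested,
        # so count what actually arrived.
        chunk = fileobj.read(min(chunk_size, remaining))
        if not chunk:  # empty result means end of stream
            return b"".join(chunks), False
        chunks.append(chunk)
        remaining -= len(chunk)
    # Cap reached: probe one more byte to see whether data was left over.
    truncated = bool(fileobj.read(1))
    return b"".join(chunks), truncated

# io.BytesIO stands in for a urlopen() response here.
data, truncated = read_capped(io.BytesIO(b"x" * 100), max_bytes=40, chunk_size=16)
```

A crawler could skip or log any page for which truncated comes back True instead of holding the whole body in memory.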

Re: urllib2 - iteration over non-sequence

2007-06-09 Thread Erik Max Francis
Gary Herron wrote:
> Certainly there are cases where xreadlines or read(bytecount) are
> reasonable, but only if the total page size is *very* large. But for
> most web pages, you guys are just nit-picking (or showing off) to
> suggest that the full read implemented by readlines is wasteful.

Re: urllib2 - iteration over non-sequence

2007-06-09 Thread Paul Rubin
Gary Herron <[EMAIL PROTECTED]> writes:
> For simplicity, I'd still suggest my original use of readlines. If
> and when you find you are downloading web pages with sizes that are
> putting a serious strain on your memory footprint, then one of the other
> suggestions might be indicated.

If you

Re: urllib2 - iteration over non-sequence

2007-06-09 Thread Gary Herron
Paul Rubin wrote:
> Erik Max Francis <[EMAIL PROTECTED]> writes:
>
>> This is really wasteful, as there's no point in reading in the whole
>> file before iterating over it. To get the same effect as file
>> iteration in later versions, use the .xreadlines method::
>>
>>     for line in aFile.x

Re: urllib2 - iteration over non-sequence

2007-06-09 Thread Paul Rubin
Erik Max Francis <[EMAIL PROTECTED]> writes:
> This is really wasteful, as there's no point in reading in the whole
> file before iterating over it. To get the same effect as file
> iteration in later versions, use the .xreadlines method::
>
>     for line in aFile.xreadlines():
>         ...

Re: urllib2 - iteration over non-sequence

2007-06-09 Thread Erik Max Francis
Gary Herron wrote:
> So... You must explicitly read the contents of the file-like object
> yourself, and loop through the lines yourself. However, fear not --
> it's easy. The socket._fileobject object provides a method "readlines"
> that reads the *entire* contents of the object, and returns a
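
Looping through the lines yourself does not require reading the whole object first: calling readline repeatedly gives one line at a time. A minimal sketch, assuming only that the object has a readline() method; iter_lines is a hypothetical helper name, io.BytesIO stands in for the response object, and the code targets current Python (generators in 2.2 needed a __future__ import):

```python
import io

def iter_lines(fileobj):
    """Yield lines one at a time using only fileobj.readline().

    iter_lines is a hypothetical helper, not part of urllib2.
    """
    while True:
        line = fileobj.readline()
        if not line:  # empty string/bytes signals end of stream
            break
        yield line

# io.BytesIO stands in for the object returned by urlopen().
page = io.BytesIO(b"<html>\n<body>hi</body>\n</html>\n")
for line in iter_lines(page):
    print(line)
```

For small pages the result matches what readlines returns; the difference is that only one line is held in memory at a time.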

Re: urllib2 - iteration over non-sequence

2007-06-09 Thread Gary Herron
[EMAIL PROTECTED] wrote:
> Thanks for the reply Larry but I am still having trouble. If I
> understand you correctly, you are just suggesting that I add an http://
> in front of the address? However when I run this:
>
> import urllib2
> site = urllib2.urlopen('http://www.google.com')

Re: urllib2 - iteration over non-sequence

2007-06-09 Thread rplobue
Thanks for the reply Larry but I am still having trouble. If I understand you correctly, you are just suggesting that I add an http:// in front of the address? However when I run this:

>>> import urllib2
>>> site = urllib2.urlopen('http://www.google.com')
>>> for line in site:
>>>     print li

Re: urllib2 - iteration over non-sequence

2007-06-09 Thread Larry Bates
[EMAIL PROTECTED] wrote:
> I'm trying to get urllib2 to work on my server which runs python
> 2.2.1. When I run the following code:
>
>
> import urllib2
> for line in urllib2.urlopen('www.google.com'):
>     print line
>
>
> I will always get the error:
> Traceback (most recent call last):
>
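
The quoted code passes 'www.google.com' with no scheme, and the follow-up in this thread mentions adding http:// in front of the address. A minimal sketch of that normalization; ensure_scheme and its default parameter are hypothetical names, and this only addresses the URL, not the iteration error, which the poster reports still seeing after the change:

```python
def ensure_scheme(url, default="http"):
    """Prefix url with default:// when it has no scheme separator.

    ensure_scheme is a hypothetical helper, not a urllib2 function.
    """
    if "://" in url:
        return url
    return default + "://" + url

# The bare host from the original post gains a scheme;
# an already-qualified URL is returned unchanged.
fixed = ensure_scheme("www.google.com")
```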