On Sun, 10 Jun 2007 02:54:47 -0300, Erik Max Francis <[EMAIL PROTECTED]>
wrote:
> Gary Herron wrote:
>
>> Certainly there are cases where xreadlines or read(bytecount) are
>> reasonable, but only if the total page size is *very* large. But for
>> most web pages, you guys are just nit-picking (or showing off) to
>> suggest that the full read implemented by readlines is wasteful.
Paul Rubin wrote:
> If you know in advance that the page you're retrieving will be
> reasonable in size, then using readlines is fine. If you don't know
> in advance what you're retrieving (e.g. you're working on a crawler)
> you have to assume that you'll hit some very large pages with ...
Gary Herron wrote:
> Certainly there are cases where xreadlines or read(bytecount) are
> reasonable, but only if the total page size is *very* large. But for
> most web pages, you guys are just nit-picking (or showing off) to
> suggest that the full read implemented by readlines is wasteful.
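As a reference point, here is a minimal sketch of the bounded-memory approach both
posters allude to: reading the response with read(bytecount) in fixed-size chunks
instead of pulling the whole page into memory at once. The URL and chunk size are
only examples, written in the Python 2 style used throughout this thread.

    import urllib2

    site = urllib2.urlopen('http://www.google.com')
    while 1:
        chunk = site.read(8192)      # fetch at most 8 KB per call
        if not chunk:                # empty string means end of data
            break
        # process the chunk here (count bytes, scan for a pattern, ...)
    site.close()

Memory use stays proportional to the chunk size rather than the page size, which
matters mainly in the crawler scenario Paul describes.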
Gary Herron <[EMAIL PROTECTED]> writes:
> For simplicity, I'd still suggest my original use of readlines. If
> and when you find you are downloading web pages with sizes that are
> putting a serious strain on your memory footprint, then one of the other
> suggestions might be indicated.
Paul Rubin wrote:
> Erik Max Francis <[EMAIL PROTECTED]> writes:
>
>> This is really wasteful, as there's no point in reading in the whole
>> file before iterating over it. To get the same effect as file
>> iteration in later versions, use the .xreadlines method::
>>
>>     for line in aFile.xreadlines():
>>         ...
Erik Max Francis <[EMAIL PROTECTED]> writes:
> This is really wasteful, as there's no point in reading in the whole
> file before iterating over it. To get the same effect as file
> iteration in later versions, use the .xreadlines method::
>
>     for line in aFile.xreadlines():
>         ...
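If the file-like object at hand happens not to provide xreadlines, the classic
readline() loop gives the same line-at-a-time behaviour without slurping the whole
file first. A sketch, assuming f is an already-opened file-like object:

    # line-at-a-time loop that only needs readline()
    while 1:
        line = f.readline()
        if not line:          # empty string signals end of file
            break
        # work with line here; it still includes the trailing newline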
Gary Herron wrote:
> So... You must explicitly read the contents of the file-like object
> yourself, and loop through the lines yourself. However, fear not --
> it's easy. The socket._fileobject object provides a method "readlines"
> that reads the *entire* contents of the object, and returns a list
> of lines.
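Spelled out as a complete snippet, Gary's suggestion looks roughly like this; the
URL is only an example, and readlines() does read the whole page before the loop
starts, which is fine for ordinary-sized pages:

    import urllib2

    site = urllib2.urlopen('http://www.google.com')
    for line in site.readlines():    # reads the entire page, then iterates
        print line
    site.close()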
[EMAIL PROTECTED] wrote:
> Thanks for the reply Larry but I am still having trouble. If I
> understand you correctly, you are just suggesting that I add an http://
> in front of the address? However when I run this:
>
>
import urllib2
site = urllib2.urlopen('http://www.google.com')
Thanks for the reply Larry but I am still having trouble. If I
understand you correctly, you are just suggesting that I add an http://
in front of the address? However when I run this:
>>> import urllib2
>>> site = urllib2.urlopen('http://www.google.com')
>>> for line in site:
...     print line
[EMAIL PROTECTED] wrote:
> I'm trying to get urllib2 to work on my server, which runs Python
> 2.2.1. When I run the following code:
>
>
> import urllib2
> for line in urllib2.urlopen('www.google.com'):
>     print line
>
>
> I will always get the error:
> Traceback (most recent call last):
>
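The truncated traceback above appears to come from passing 'www.google.com' with
no scheme: urlopen cannot tell which protocol handler to use unless the URL starts
with something like http://. A minimal sketch of the fix Larry suggested, using a
hypothetical helper that prepends the scheme when it is missing (written with the
Python 2.2 mentioned above in mind, so it uses str.find rather than a substring
test with 'in'):

    import urllib2

    def open_url(address):
        # urlopen needs an explicit scheme such as http://
        if address.find('://') < 0:      # str.find works on Python 2.2
            address = 'http://' + address
        return urllib2.urlopen(address)

    site = open_url('www.google.com')
    print site.read(200)                 # first 200 bytes of the page
    site.close()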