Re: Seek the one billionth line in a file containing 3 billion lines.

Sullivan WxPyQtKinter Tue, 07 Aug 2007 23:54:22 -0700

On Aug 8, 2:35 am, Paul Rubin <http://[EMAIL PROTECTED]> wrote:
> Sullivan WxPyQtKinter <[EMAIL PROTECTED]> writes:
> > This program:
> > for i in range(1000000000):
> >       f.readline()
> > is absolutely very slow....
>
> There are two problems:
>
>  1) range(1000000000) builds a list of a billion elements in memory,
>     which is many gigabytes and probably thrashing your machine.
>     You want to use xrange instead of range, which builds an iterator
>     (i.e. something that uses just a small amount of memory, and
>     generates the values on the fly instead of precomputing a list).
>
>  2) f.readline() reads an entire line of input which (depending on
>     the nature of the log file) could also be of very large size.
>     If you're sure the log file contents are sensible (lines up to
>     several megabytes shouldn't cause a problem) then you can do it
>     that way, but otherwise you want to read fixed size units.



Thank you for pointing out these two problems. I wrote this program
just to show how inefficient it is to use a seemingly NATIVE way
to seek in such a big file. No other intention.
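
For reference, here is a rough sketch of the "read fixed size units"
idea from point 2 above. It is only an illustration under some
assumptions, not a tuned solution: the name offset_of_line and the
1 MiB default chunk size are made up for this example, the file is
assumed to be opened in binary mode and positioned at the start, and
lines are assumed to end with "\n". It is written for Python 3, where
range() is already lazy in the way xrange is in point 1.

```python
import io


def offset_of_line(f, target, chunk_size=1 << 20):
    """Return the byte offset at which line `target` (0-based) starts.

    Hypothetical helper for illustration. `f` must be a binary file
    object positioned at offset 0. Returns None if the file contains
    fewer than `target` newline characters.
    """
    if target == 0:
        return 0
    seen = 0  # newlines counted in earlier chunks
    base = 0  # file offset of the start of the current chunk
    while True:
        chunk = f.read(chunk_size)
        if not chunk:
            return None  # ran out of data before reaching the line
        count = chunk.count(b"\n")
        if seen + count >= target:
            # The newline that ends line `target - 1` is in this chunk;
            # walk to it with find() instead of splitting into lines.
            pos = -1
            for _ in range(target - seen):
                pos = chunk.find(b"\n", pos + 1)
            return base + pos + 1
        seen += count
        base += len(chunk)


if __name__ == "__main__":
    demo = io.BytesIO(b"first\nsecond\nthird\n")
    start = offset_of_line(demo, 2, chunk_size=4)
    demo.seek(start)
    print(demo.readline())
```

Memory use stays bounded by chunk_size no matter how long a single
line is, which is the concern raised in point 2. The whole prefix of
the file still has to be read once, so this does not avoid the I/O
cost; it only avoids building a line object per line.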

-- 
http://mail.python.org/mailman/listinfo/python-list
