Caleb Hattingh wrote:
>> Yes, you can even write
>>
>> f = open("data.txt")
>> for line in f:
>>     # do stuff with line
>> f.close()
>>
>> This has the additional benefit of not slurping in the entire file at
>> once.
>
> Is there disk access on every iteration? I'm guessing yes? It shouldn't
> be an issue in the vast majority of cases, but I'm naturally curious :)
Well, you will hardly find an OS that does no buffering of disk access --
but file.next() does some extra optimization, as Steven already explained.
Here are some timings performed on the file that has the first-hand
information about Python's file buffering strategy :-)

$ python2.4 -m timeit 'for line in file("fileobject.c"): pass'
1000 loops, best of 3: 528 usec per loop

$ python2.4 -m timeit 'for line in file("fileobject.c").readlines(): pass'
1000 loops, best of 3: 635 usec per loop

$ python2.4 -m timeit 'for line in iter(file("fileobject.c").readline, ""): pass'
1000 loops, best of 3: 1.59 msec per loop

$ python2.4 -m timeit 'f = file("fileobject.c")' 'while 1:' '    if not f.readline(): break'
100 loops, best of 3: 2.08 msec per loop

So not only is

for line in file(...):
    # do stuff

the most elegant, it is also the fastest. file.readlines() comes close,
but is only viable for "small" files.

Peter
--
http://mail.python.org/mailman/listinfo/python-list
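A follow-up note: the shell timings above can also be reproduced from inside Python with the timeit module. A rough sketch, assuming a generated sample file in place of fileobject.c (absolute numbers will differ by machine; the ranking is what matters):

```python
import os
import tempfile
import timeit

# Generate a sample file standing in for fileobject.c (an assumption;
# any reasonably sized text file would do).
fd, path = tempfile.mkstemp(suffix=".c")
with os.fdopen(fd, "w") as f:
    f.write("some source line\n" * 2000)

# The four variants from the thread, as timeit statements.
variants = {
    "for line in open(...)":
        "for line in open(path): pass",
    "readlines()":
        "for line in open(path).readlines(): pass",
    "iter(readline, '')":
        "for line in iter(open(path).readline, ''): pass",
    "while readline()":
        "f = open(path)\nwhile 1:\n    if not f.readline(): break",
}

results = {}
for name, stmt in variants.items():
    # number=200 keeps the demo quick; larger values give steadier numbers.
    results[name] = timeit.timeit(stmt, globals={"path": path}, number=200)

for name, seconds in sorted(results.items(), key=lambda kv: kv[1]):
    print("%-24s %.2f msec per loop" % (name, seconds / 200 * 1e3))

os.remove(path)
```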
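For completeness: all four variants yield the same sequence of lines; the difference is when the data is read. A minimal sketch (using a temporary file, since fileobject.c may not be at hand) showing the equivalence:

```python
import os
import tempfile

# Create a small stand-in file (an assumption -- any text file works).
fd, path = tempfile.mkstemp(suffix=".txt")
with os.fdopen(fd, "w") as f:
    f.write("alpha\nbeta\ngamma\n")

# Variant 1: iterate the file object -- lines come from an internal
# read-ahead buffer, one at a time.
f = open(path)
iterated = [line for line in f]
f.close()

# Variant 2: readlines() -- the whole file is materialized as a list
# up front, which is why it only suits "small" files.
f = open(path)
slurped = f.readlines()
f.close()

# Variant 3: the sentinel form of iter() calls readline() until it
# returns the empty string at EOF.
f = open(path)
sentinel = list(iter(f.readline, ""))
f.close()

assert iterated == slurped == sentinel
os.remove(path)
print(iterated)  # ['alpha\n', 'beta\n', 'gamma\n']
```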