On Sun, 22 Jan 2012 07:50:59 -0800, Rick Johnson wrote:

> What does Python do when presented with this code?
>
> py> [line.strip('\n') for line in f.readlines()]
>
> If Python reads all the file lines first and THEN iterates AGAIN to do
> the strip; we are driving a Fred flintstone mobile.

Nonsense. File-like objects offer two APIs: there is a lazy iterator
approach, using the file-like object itself as an iterator, and an eager
read-it-all-at-once approach, offered by the venerable readlines()
method. readlines *deliberately* reads the entire file, and if you as a
developer do so by accident, you have no-one to blame but yourself. Only
a poor tradesman blames his tools instead of taking responsibility for
learning how to use them himself.

You should use whichever approach is more appropriate for your
situation. You might want to consider reading from the file as quickly
as possible, in one big chunk if you can, so you can close it again and
let other applications have access to it. Or you might not care. The
choice is yours.

For small files, readlines() will probably be faster, although for small
files it won't make much practical difference. Who cares whether it
takes 0.01ms or 0.02ms? For medium-sized files, say, a few thousand
lines, it could go either way, depending on memory use, the size of the
internal file buffer, and implementation details. Only for files large
enough that allocating memory for all the lines at once becomes
significant will lazy iteration be a clear winner. But if the file is
that big, are you sure that a list comprehension is the right tool in
the first place?

In general, you should not care greatly which of the two you use, unless
profiling your application shows that this is the bottleneck. But it is
extremely unlikely that copying even a few thousand lines around in
memory will be slower than reading them from disk in the first place.
Unless you expect to be handling truly large files, you've got more
important things to optimize before wasting time caring about this.

--
Steven
--
http://mail.python.org/mailman/listinfo/python-list
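
For concreteness, here is a rough sketch of the two approaches discussed
above, written for Python 3. The function names strip_eager and
strip_lazy are made up for illustration, and io.StringIO stands in for a
real file object; neither comes from the original thread.

```python
import io


def strip_eager(f):
    # readlines() reads the whole file into a list of lines first;
    # the comprehension then walks that list and builds a second list.
    return [line.strip('\n') for line in f.readlines()]


def strip_lazy(f):
    # Iterating over the file object itself yields one line at a time,
    # so no intermediate list of unstripped lines is created.
    return [line.strip('\n') for line in f]


# io.StringIO is an in-memory stand-in for an open text file (assumption
# made so the example runs without touching the disk).
text = "spam\neggs\nham\n"
eager = strip_eager(io.StringIO(text))
lazy = strip_lazy(io.StringIO(text))
print(eager == lazy)
```

Both functions return the same list; they differ only in whether the
unstripped lines are all held in memory at once. Which one is faster for
a given file is something to measure rather than assume.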