On 15 Jan 2006 16:44:24 -0800, Paul Rubin <"http://phr.cx"@nospam.invalid> wrote:

>I find pretty often that I want to loop through characters in a file:
>
>    while True:
>        c = f.read(1)
>        if not c: break
>        ...
>
>or sometimes of some other blocksize instead of 1.  It would sure
>be easier to say something like:
>
>    for c in f.iterbytes(): ...
>
>or
>
>    for c in f.iterbytes(blocksize): ...
>
>this isn't anything terribly advanced but just seems like a matter of
>having the built-in types keep up with language features.  The current
>built-in iterator (for line in file: ...) is useful for text files but
>can potentially read strings of unbounded size, so it's inadvisable for
>arbitrary files.
>
>Does anyone else like this idea?
It's a pretty useful thing to do, but the edge cases are somewhat
complex.  When I just want the dumb version, I tend to write this:

    for chunk in iter(lambda: f.read(blocksize), ''):
        ...

which is only very slightly longer than your version.  I would like it
even more if iter() had been written with the impending doom of lambda
in mind, so that this would work:

    for chunk in iter('', f.read, blocksize):
        ...

But it's a bit late now.

Anyhow, here are some questions about your iterbytes():

  * Would it guarantee that the chunks returned were read using a
    single read?  If blocksize were a multiple of the filesystem block
    size, would it guarantee reads on block boundaries (where possible)?

  * How would it handle EOF?  Would it stop iterating immediately after
    the first short read, or would it wait for an empty return?

  * What would the buffering behavior be?  Could one interleave calls
    to .next() on whatever iterbytes() returns with calls to .read()
    on the file?

Jean-Paul
-- 
http://mail.python.org/mailman/listinfo/python-list
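The two-argument iter(callable, sentinel) idiom above can be exercised
on an in-memory file; here io.StringIO stands in for a real file opened
in text mode, where read() returns the empty string at EOF:

```python
import io

# iter() with two arguments calls the first argument repeatedly and
# stops iteration when it returns the sentinel (here, the empty string
# that read() produces once the file is exhausted).
f = io.StringIO("abcdefg")
blocksize = 3
chunks = list(iter(lambda: f.read(blocksize), ''))
print(chunks)  # → ['abc', 'def', 'g']
```

Note the final chunk may be shorter than blocksize; only the empty
string terminates the loop.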