On 2/20/07, Nathan <[EMAIL PROTECTED]> wrote:
> On 2/19/07, Gabriel Genellina <[EMAIL PROTECTED]> wrote:
> > En Mon, 19 Feb 2007 21:50:11 -0300, GiBo <[EMAIL PROTECTED]> escribió:
> >
> > > Grant Edwards wrote:
> > >> On 2007-02-19, GiBo <[EMAIL PROTECTED]> wrote:
> > >>>
> > >>> Classic situation - I have to process an input stream of unknown length
> > >>> until I reach its end (EOF, End Of File). How do I check for EOF? The
> > >>> input stream can be anything from an opened file through sys.stdin to a
> > >>> network socket. And it's binary and potentially huge (gigabytes), thus
> > >>> "for line in stream.readlines()" isn't really the way to go.
> > >>>
> > >>> For now I have roughly:
> > >>>
> > >>> stream = sys.stdin
> > >>> while True:
> > >>>     data = stream.read(1024)
> > >>       if len(data) == 0:
> > >>           break  # EOF
> > >>>     process_data(data)
> > >
> > > Right, not a big difference though. Isn't there a cleaner / more
> > > intuitive way? Like using some wrapper objects around the streams or
> > > something?
> >
> > Read the documentation... For a true file object:
> > read([size]) ... An empty string is returned when EOF is encountered
> > immediately.
> > All the other "file-like" objects (like StringIO, socket.makefile, etc.)
> > maintain this behavior.
> > So this is the way to check for EOF. If you don't like how it was spelled,
> > try this:
> >
> > if data == "": break
> >
> > If your data is made of lines of text, you can use the file as its own
> > iterator, yielding lines:
> >
> > for line in stream:
> >     process_line(line)
> >
> > --
> > Gabriel Genellina
>
> Not to beat a dead horse, but I often do this:
>
> data = f.read(bufsize)
> while data:
>     # ... process data.
>     data = f.read(bufsize)
>
> The only annoying bit is the duplicated line. I find I often follow
> this pattern, and I realize Python doesn't plan to have any sort of
> do-while construct, but even still I prefer this idiom. What's the
> consensus here?
>
> What about creating a standard binary-file iterator:
>
> def blocks_of(infile, bufsize = 1024):
>     data = infile.read(bufsize)
>     if data:
>         yield data
>
> The use would look like this:
>
> for block in blocks_of(myfile, bufsize = 2**16):
>     process_data(block)  # len(block) <= bufsize...
(ahem), make that iterator something that works, like:

def blocks_of(infile, bufsize = 1024):
    data = infile.read(bufsize)
    while data:
        yield data
        data = infile.read(bufsize)
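For completeness, the duplicated read() call can also be avoided without writing a generator at all, by using the two-argument form of the built-in iter(), which calls a function repeatedly until it returns a sentinel. A minimal sketch, assuming (as Gabriel notes above) that read() returns an empty string at EOF; process_data, bufsize and the choice of sys.stdin are just placeholders:

import sys
from functools import partial

def process_data(block):
    # stand-in for whatever real processing each block needs
    pass

bufsize = 1024
stream = sys.stdin  # or an open file, socket.makefile(), StringIO, ...

# iter(callable, sentinel) calls the callable repeatedly and stops as soon
# as it returns the sentinel -- here, the empty string read() gives at EOF.
for block in iter(partial(stream.read, bufsize), ''):
    process_data(block)

This does the same job as the corrected blocks_of() generator; which spelling is clearer is a matter of taste.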