On 2/19/07, Gabriel Genellina <[EMAIL PROTECTED]> wrote:
> En Mon, 19 Feb 2007 21:50:11 -0300, GiBo <[EMAIL PROTECTED]> escribió:
>
> > Grant Edwards wrote:
> >> On 2007-02-19, GiBo <[EMAIL PROTECTED]> wrote:
> >>>
> >>> Classic situation - I have to process an input stream of unknown length
> >>> until I reach its end (EOF, End Of File). How do I check for EOF? The
> >>> input stream can be anything from an opened file through sys.stdin to a
> >>> network socket. And it's binary and potentially huge (gigabytes), thus
> >>> "for line in stream.readlines()" isn't really a way to go.
> >>>
> >>> For now I have roughly:
> >>>
> >>>     stream = sys.stdin
> >>>     while True:
> >>>         data = stream.read(1024)
> >>         if len(data) == 0:
> >>             break  # EOF
> >>>         process_data(data)
> >
> > Right, not a big difference though. Isn't there a cleaner / more
> > intuitive way? Like using some wrapper objects around the streams or
> > something?
>
> Read the documentation... For a true file object:
>     read([size]) ... An empty string is returned when EOF is encountered
>     immediately.
> All the other "file-like" objects (like StringIO, socket.makefile, etc.)
> maintain this behavior.
> So this is the way to check for EOF. If you don't like how it was spelled,
> try this:
>
>     if data == "": break
>
> If your data is made of lines of text, you can use the file as its own
> iterator, yielding lines:
>
>     for line in stream:
>         process_line(line)
>
> --
> Gabriel Genellina
>
> --
> http://mail.python.org/mailman/listinfo/python-list
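The chunked-read loop quoted above can be sketched as a self-contained snippet; here an in-memory io.BytesIO stands in for sys.stdin or a socket file, and process_data is just an illustrative stand-in that counts bytes:

```python
import io

total = 0

def process_data(chunk):
    # Stand-in for real processing: just accumulate the byte count.
    global total
    total += len(chunk)

# io.BytesIO mimics any binary file-like object (file, socket.makefile(), ...).
stream = io.BytesIO(b"x" * 3000)
while True:
    data = stream.read(1024)
    if len(data) == 0:
        break  # read() returns an empty result at EOF
    process_data(data)

print(total)  # 3000 bytes seen, in chunks of 1024, 1024, 952
```

The key property the loop relies on is exactly what the quoted documentation states: read(size) returns an empty string (empty bytes for binary streams) once EOF is reached, and keeps returning it on subsequent calls.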
Not to beat a dead horse, but I often do this:

    data = f.read(bufsize)
    while data:
        # ... process data.
        data = f.read(bufsize)

The only annoying bit is the duplicated line. I find I often follow this
pattern, and I realize Python doesn't plan to have any sort of do-while
construct, but even so I prefer this idiom. What's the consensus here?

What about creating a standard binary-file iterator:

    def blocks_of(infile, bufsize=1024):
        while True:
            data = infile.read(bufsize)
            if not data:
                break
            yield data

The use would look like this:

    for block in blocks_of(myfile, bufsize=2**16):
        process_data(block)  # len(block) <= bufsize...

--
http://mail.python.org/mailman/listinfo/python-list
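A minimal runnable check of the generator idiom above (the io.BytesIO stream and the data sizes are purely illustrative):

```python
import io

def blocks_of(infile, bufsize=1024):
    # Yield successive chunks until read() returns an empty result (EOF).
    while True:
        data = infile.read(bufsize)
        if not data:
            break
        yield data

# 2500 bytes should come back as chunks of 1024, 1024, and 452.
stream = io.BytesIO(b"a" * 2500)
blocks = list(blocks_of(stream, bufsize=1024))
print([len(b) for b in blocks])  # [1024, 1024, 452]
```

As an aside, the standard library already offers a close spelling of this via the two-argument form of iter(), which calls a function repeatedly until it returns a sentinel: `for block in iter(lambda: stream.read(bufsize), b""): ...`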