On Sat, Feb 27, 2016 at 8:49 PM, Steven D'Aprano <st...@pearwood.info> wrote:
> On Thu, 25 Feb 2016 06:30 pm, Chris Angelico wrote:
>
>> On Thu, Feb 25, 2016 at 5:50 PM, Steven D'Aprano
>> <steve+comp.lang.pyt...@pearwood.info> wrote:
>>>
>>> # Read a chunk of bytes/characters from an open file.
>>> def chunkiter(f, delim):
>>>     buffer = []
>>>     b = f.read(1)
>>>     while b:
>>>         buffer.append(b)
>>>         if b in delim:
>>>             yield ''.join(buffer)
>>>             buffer = []
>>>         b = f.read(1)
>>>     if buffer:
>>>         yield ''.join(buffer)
>>
>> How bad is it if you over-read?
>
> Pretty bad :-)
>
> Ideally, I'd rather not over-read at all. I'd like the user to be able to
> swap from "read N bytes" to "read to the next delimiter" (and possibly
> even "read the next line") without losing anything.

If those are the *only* two operations, you should be able to maintain
your own buffer. Something like this:

    import re

    class ChunkIter:
        def __init__(self, f, delim):
            self.f = f
            # re.escape keeps characters like "]" or "^" literal in the class
            self.delim = re.compile("[" + re.escape(delim) + "]")
            self.buffer = ""

        def read_to_delim(self):
            """Return characters up to (not including) the next delim,
            or remaining chars, or "" if at EOF"""
            while "delimiter not found":
                *parts, self.buffer = self.delim.split(self.buffer, 1)
                if parts:
                    return parts[0]
                b = self.f.read(256)
                if not b:
                    # EOF: hand back the leftovers once, then empty the buffer
                    ret, self.buffer = self.buffer, ""
                    return ret
                self.buffer += b

        def read(self, nbytes):
            need = nbytes - len(self.buffer)
            if need > 0:
                self.buffer += self.f.read(need)
            # Slice by nbytes, not need: need is <= 0 whenever the buffer
            # already holds enough characters
            ret, self.buffer = self.buffer[:nbytes], self.buffer[nbytes:]
            return ret

It still might over-read from the underlying file, but those extra
chars will be available to the read(N) function.

ChrisA
-- 
https://mail.python.org/mailman/listinfo/python-list
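
A quick way to see the shared buffer at work is to interleave the two calls on an in-memory file. This is only a sketch: the class name BufferedChunks, the blocksize parameter, and the sample text are made up for illustration, the file is assumed to be opened in text mode, and the delimiter is consumed rather than returned.

```python
import io
import re


class BufferedChunks:
    """Hypothetical stand-in for the buffer-sharing reader described above."""

    def __init__(self, f, delim, blocksize=256):
        self.f = f
        self.delim = re.compile("[" + re.escape(delim) + "]")
        self.blocksize = blocksize  # assumed knob; the post hard-codes 256
        self.buffer = ""

    def read_to_delim(self):
        # Return text up to (not including) the next delimiter character.
        while True:
            parts = self.delim.split(self.buffer, 1)
            if len(parts) == 2:
                chunk, self.buffer = parts
                return chunk
            block = self.f.read(self.blocksize)
            if not block:
                # EOF: return whatever is left and empty the buffer.
                chunk, self.buffer = self.buffer, ""
                return chunk
            self.buffer += block

    def read(self, n):
        # Serve from the buffer first; only hit the file for the shortfall.
        missing = n - len(self.buffer)
        if missing > 0:
            self.buffer += self.f.read(missing)
        chunk, self.buffer = self.buffer[:n], self.buffer[n:]
        return chunk


src = io.StringIO("alpha,beta;gamma delta")
chunks = BufferedChunks(src, ",;")

first = chunks.read_to_delim()   # over-reads the whole input into the buffer
fixed = chunks.read(3)           # served from the buffer, nothing is lost
second = chunks.read_to_delim()
rest = chunks.read_to_delim()    # no delimiter left, so the remainder
eof = chunks.read_to_delim()     # empty string once everything is consumed

print(repr(first), repr(fixed), repr(second), repr(rest), repr(eof))
```

The underlying file position runs ahead of what the caller has seen, so this only holds up if all reads go through the wrapper and nothing else reads from the file directly.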