On Thu, Feb 25, 2016 at 5:50 PM, Steven D'Aprano
<steve+comp.lang.pyt...@pearwood.info> wrote:
>
> # Read a chunk of bytes/characters from an open file.
> def chunkiter(f, delim):
>     buffer = []
>     b = f.read(1)
>     while b:
>         buffer.append(b)
>         if b in delim:
>             yield ''.join(buffer)
>             buffer = []
>         b = f.read(1)
>     if buffer:
>         yield ''.join(buffer)

How bad is it if you over-read? If it's absolutely critical that you
not consume anything from the file that you shouldn't, then yeah, it's
going to be slow. But if you're never going to read the file through
anything other than this iterator, the best thing to do is to read
more at a time. Simple and naive method:

import re

def chunkiter(f, delim):
    """Don't use [ or ] as the delimiter, kthx"""
    buffer = ""
    b = f.read(256)
    while b:
        buffer += b
        # Split on any delimiter character; the last piece may be an
        # incomplete chunk, so it stays in the buffer for the next pass.
        # Note: unlike the one-byte version, this drops the delimiters.
        *parts, buffer = re.split("[" + delim + "]", buffer)
        yield from parts
        b = f.read(256)  # refill, or the loop never terminates
    if buffer:
        yield buffer
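
For instance, splitting a text file on commas or semicolons (the file
name here is made up):

with open("data.txt") as f:   # hypothetical input file
    for chunk in chunkiter(f, ",;"):
        print(repr(chunk))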

How well does that perform?
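
One quick way to find out: time a full pass over an in-memory file with
time.perf_counter. A rough harness sketch (the sample text, its size,
and the delimiter set are all arbitrary, and "chunkiter" means whichever
version you're timing):

import io
import time

def bench(chunker, data, delim):
    # Time one complete pass of a chunk iterator over an in-memory file.
    start = time.perf_counter()
    count = sum(1 for _ in chunker(io.StringIO(data), delim))
    return count, time.perf_counter() - start

# Synthetic sample text; contents are arbitrary.
data = "spam,eggs;ham.and.cheese." * 100000

count, secs = bench(chunkiter, data, ",;.")
print(count, "chunks in", round(secs, 3), "seconds")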

ChrisA
