goldtech wrote:
> Hi,
>
> Say I have a very big string with a pattern like:
>
> akakksssk3dhdhdhdbddb3dkdkdkddk3dmdmdmd3dkdkdkdk3asnsn.....
>
> I want to split the string into separate parts on the "3" and process
> each part separately. I might run into memory limitations if I use
> "split" and get a big array(?) I wondered if there's a way I could
> read (stream?) the string from start to finish and read what's
> delimited by the "3" into a variable, process the smaller string
> variable then append/build a new string with the processed data?
>
> Would I loop it and read it char by char till a "3"...? Or?

You can read the file in chunks:

    from functools import partial

    def read_chunks(instream, chunksize=None):
        if chunksize is None:
            chunksize = 2**20
        # iter() with a sentinel keeps calling instream.read(chunksize)
        # until it returns "" (use b"" for a file opened in binary mode)
        return iter(partial(instream.read, chunksize), "")

    def split_file(instream, delimiter, chunksize=None):
        leftover = ""
        chunk = None
        for chunk in read_chunks(instream, chunksize):
            chunk = leftover + chunk
            parts = chunk.split(delimiter)
            # the last piece may be incomplete (or hold part of a
            # multi-character delimiter), so carry it into the next chunk
            leftover = parts.pop()
            for part in parts:
                yield part
        # mimic str.split(): an empty file or a trailing delimiter
        # produces a final empty string
        if leftover or chunk is None or chunk.endswith(delimiter):
            yield leftover

I hope I got the corner cases right.

PS: This has come up before, but I couldn't find the relevant threads...
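
If the data is already a string in memory, as in the original question, a generator built on str.find() avoids creating the full list that split() returns. This is only a rough sketch: iter_parts() and process() are names made up for illustration, and process() is a stand-in for whatever per-part work is needed.

```python
def iter_parts(text, delimiter):
    """Yield the pieces of text between delimiters, one at a time."""
    if not delimiter:
        raise ValueError("empty delimiter")
    start = 0
    while True:
        end = text.find(delimiter, start)
        if end == -1:
            # no more delimiters: the rest of the string is the last part
            yield text[start:]
            return
        yield text[start:end]
        start = end + len(delimiter)


def process(part):
    # hypothetical placeholder for the real per-part processing
    return part.upper()


data = "akakksssk3dhdhdhdbddb3dkdkdkddk"
result = "3".join(process(part) for part in iter_parts(data, "3"))
print(result)
```

The generator creates only one slice at a time, but join() still collects all processed pieces before building the result. If the output is large too, writing each processed part to a file as it is produced may be the better choice.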