Here is a simple function that scans through an input file and groups the lines of the file into sections. Sections start with 'Name:' and end with a blank line. The function yields sections as they are found.

    def makeSections(f):
        currSection = []

        for line in f:
            line = line.strip()
            if line == 'Name:':
                # Start of a new section
                if currSection:
                    yield currSection
                    currSection = []
                currSection.append(line)

            elif not line:
                # Blank line ends a section
                if currSection:
                    yield currSection
                    currSection = []

            else:
                # Accumulate into a section
                currSection.append(line)

        # Yield the last section
        if currSection:
            yield currSection

There is some obvious code duplication in the function - this bit is repeated 2.67 times ;-):

    if currSection:
        yield currSection
        currSection = []

You can write:

    for section in yieldSection():
        yield section

in both places, but I assume you still don't like the code duplication this would create.
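
For what it's worth, here is roughly what such a helper might look like. This is only a sketch: the name flushSection and its list argument are my own invention (yieldSection() above is just a placeholder), and since a helper generator cannot rebind the caller's variable, it empties the list in place instead.

```python
def flushSection(section):
    """Hypothetical helper: yield a copy of section if non-empty, then empty it in place."""
    if section:
        yield list(section)
        del section[:]


def makeSections(f):
    currSection = []
    for line in f:
        line = line.strip()
        if line == 'Name:':
            # Start of a new section: emit whatever was pending first
            for section in flushSection(currSection):
                yield section
            currSection.append(line)
        elif not line:
            # Blank line ends a section
            for section in flushSection(currSection):
                yield section
        else:
            # Accumulate into a section
            currSection.append(line)
    # Emit the last section, if any
    for section in flushSection(currSection):
        yield section
```

The two-line loop still shows up three times, which is the duplication I mean.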
How about something like (completely untested):

    if line == 'Name:' or not line:
        if currSection:
            yield currSection
            currSection = []
        if line == 'Name:':
            currSection.append(line)

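
Filled out into the whole function, that idea might look like the sketch below. It assumes every other line still goes through the same else branch as in the original, and it has only been checked against the small sample shown.

```python
def makeSections(f):
    currSection = []
    for line in f:
        line = line.strip()
        if line == 'Name:' or not line:
            # Either kind of boundary closes the section in progress
            if currSection:
                yield currSection
                currSection = []
            if line == 'Name:':
                # A 'Name:' line also opens the next section
                currSection.append(line)
        else:
            # Accumulate into the current section
            currSection.append(line)
    # Yield the last section
    if currSection:
        yield currSection


sample = ['Name:\n', 'A\n', 'x\n', '\n', 'Name:\n', 'B\n', 'a\n']
print(list(makeSections(sample)))
# prints [['Name:', 'A', 'x'], ['Name:', 'B', 'a']]
```

That leaves the yield-and-reset in two places rather than three.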
Another consideration: in Python 2.4, itertools has a groupby function that you could probably get some benefit from:

    >>> import itertools
    >>> class Sections(object):
    ...     def __init__(self):
    ...         self.is_section = False
    ...     def __call__(self, line):
    ...         if line == 'Name:\n':
    ...             self.is_section = True
    ...         elif line == '\n':
    ...             self.is_section = False
    ...         return self.is_section
    ...
    >>> def make_sections(f):
    ...     for _, section in itertools.groupby(f, Sections()):
    ...         result = ''.join(section)
    ...         if result != '\n':
    ...             yield result
    ...
    >>> f = 'Name:\nA\nx\ny\nz\n\nName:\nB\na\nb\nc\n'.splitlines(True)
    >>> list(make_sections(f))
    ['Name:\nA\nx\ny\nz\n', 'Name:\nB\na\nb\nc\n']
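
One possible variation, as a sketch: groupby already hands back the key for each group, so the generator can test that instead of comparing the joined text against '\n'. That also skips runs of several blank lines, which the comparison would let through as '\n\n'. The name make_sections_by_key is mine, chosen only to keep it apart from make_sections.

```python
import itertools


class Sections(object):
    """Stateful key function: True from a 'Name:' line until the next blank line."""
    def __init__(self):
        self.is_section = False

    def __call__(self, line):
        if line == 'Name:\n':
            self.is_section = True
        elif line == '\n':
            self.is_section = False
        return self.is_section


def make_sections_by_key(f):
    # Hypothetical variant: keep only the groups whose key is True
    for is_section, section in itertools.groupby(f, Sections()):
        if is_section:
            yield ''.join(section)


lines = 'Name:\nA\nx\n\n\nName:\nB\na\n'.splitlines(True)
print(list(make_sections_by_key(lines)))
# prints ['Name:\nA\nx\n', 'Name:\nB\na\n']
```

Note that this variant also drops any text that appears before the first 'Name:' line, since those lines get a False key.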