patrick.waldo wrote:

> manipulation?  Also, I conceptually get it, but would you mind walking
> me through
>> for key, group in groupby(instream, unicode.isspace):
>>     if not key:
>>         yield "".join(group)

itertools.groupby() splits a sequence into groups of items that share
the same key; e. g. to group names by their first letter you'd do the
following:

>>> from itertools import groupby
>>> def first_letter(s): return s[:1]
...
>>> for key, group in groupby(["Anne", "Andrew", "Bill", "Brett", "Alex"],
...                           first_letter):
...     print "--- %s ---" % key
...     for item in group:
...         print item
...
--- A ---
Anne
Andrew
--- B ---
Bill
Brett
--- A ---
Alex

Note that there are two groups with the same initial; groupby() only
puts consecutive items of the sequence into the same group.

In your case the sequence consists of the lines in the file, converted
to unicode strings. The key is a boolean indicating whether the line
consists entirely of whitespace or not:

>>> u"\n".isspace()
True
>>> u"alpha\n".isspace()
False

I call it slightly differently, though, as an unbound method:

>>> unicode.isspace(u"alpha\n")
False

This is only possible because all items in the sequence are known to be
unicode instances. So far we have, using a list instead of a file:

>>> instream = [u"alpha\n", u"beta\n", u"\n", u"gamma\n", u"\n", u"\n",
...             u"delta\n"]
>>> for key, group in groupby(instream, unicode.isspace):
...     print "--- %s ---" % key
...     for item in group:
...         print repr(item)
...
--- False ---
u'alpha\n'
u'beta\n'
--- True ---
u'\n'
--- False ---
u'gamma\n'
--- True ---
u'\n'
u'\n'
--- False ---
u'delta\n'

As you see, groups with real data alternate with groups that contain
only blank lines, and the key for the latter is True, so we can skip
them with

if not key: # it's not a separator group
    yield group

As the final refinement we join all lines of the group into a single
string

>>> "".join(group)
u'alpha\nbeta\n'

and that's it.

Peter
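
For reference, the steps walked through in this post can be put
together into one small generator. This is only a sketch under some
assumptions: it targets Python 3, so str.isspace stands in for
unicode.isspace (Python 3 has no separate unicode type), and the
function name paragraphs is made up for illustration and does not
appear in the thread.

```python
from itertools import groupby


def paragraphs(instream):
    """Yield each run of consecutive non-blank lines as one string."""
    # groupby() starts a new group whenever the key changes; here the key
    # is True for whitespace-only lines and False for lines with data.
    for key, group in groupby(instream, str.isspace):
        if not key:  # skip the separator groups made of blank lines
            yield "".join(group)


# Same sample data as in the post, as plain (Python 3) strings.
lines = ["alpha\n", "beta\n", "\n", "gamma\n", "\n", "\n", "delta\n"]
result = list(paragraphs(lines))
print(result)  # ['alpha\nbeta\n', 'gamma\n', 'delta\n']
```

Passing str.isspace as an unbound method works for the same reason as
in the original: every item in the sequence is known to be a str.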