On Fri, May 8, 2009 at 8:53 AM, <pyt...@bdurham.com> wrote: > I'm looking for suggestions on technique (not necessarily code) about the > most pythonic way to normalize vertical whitespace in blocks of text so that > there is never more than 1 blank line between paragraphs. Our source text > has newlines normalized to single newlines (\n vs. combinations of \r and > \n), but there may be leading and trailing whitespace around each newline. >
I'm not sure what's more Pythonic in this case, but the approach I'd use is a generator. Like so: --- code to follow -- def cleaner(body): empty = False for line in body.splitlines(): line = line.rstrip() if line: empty = False yield line else: if empty: continue else: empty = True yield '' Text = """ This is my first paragraph. This is my second. This is my third. This is my fourth. This is my fifth. This is my sixth. This is my seventh. Okay? """ print '\n'.join(cleaner(Text)) --- end code --- You maybe want to tweak the logic a bit depending on how you want to handle initial blank lines or such, and I'm two paragraphs w/o a blank line between them (you talk some of going from \n\n\n to \n\n, so I'm not sure if you want to allow blank lines separating the paragraphs or require it). I guess this basically is your first approach, just specifically going with "Use a generator to do it" instead of regular list building cuz that should be faster and cleaner int his case.... and I don't think the third approach is really all that needed. Python can do this pretty quickly. The second approach seems distinctly unPythonic. I use regular expressions in several places, mind, but if there's a straight-forward clean way to do anything without them that always seems like the Pythonic thing to me. :) But that's just me :) --S
-- http://mail.python.org/mailman/listinfo/python-list