On Tuesday, January 4, 2011 11:26:48 AM UTC-7, MRAB wrote:
> On 04/01/2011 17:11, Jeremy wrote:
> > I am trying to write a regular expression that finds and deletes (replaces
> > with nothing) comments in a string/file. Comments are defined by the first
> > non-whitespace character is a 'c' or a dollar sign somewhere in the line.
> > I want to replace these comments with nothing which isn't too hard. The
> > trouble is, the comments are replaced with a new-line; or the new-line
> > isn't captured in the regular expression.
> >
> > Below, I have copied a minimal example. Can someone help?
> >
> > Thanks,
> > Jeremy
> >
> >
> > import re
> >
> > text = """ c
> > C - Second full line comment (first comment had no text)
> > c Third full line comment
> > F44:N 2 $ Inline comments start with dollar sign and go to end of
> > line"""
> >
> > commentPattern = re.compile("""
> > (^\s*?c\s*?.*?| # Comment start with c or C
> > \$.*?)$\n # Comment starting with $
> > """, re.VERBOSE|re.MULTILINE|re.IGNORECASE)
> >
> Part of the problem is that you're not using raw string literals or
> doubling the backslashes.
>
> Try something like this:
>
> commentPattern = re.compile(r"""
> (^[ \t]*c.*\n| # Comment start with c or C
> [ \t]*\$.*) # Comment starting with $
> """, re.VERBOSE|re.MULTILINE|re.IGNORECASE)

Using a raw string literal fixed the problem for me. Thanks for the suggestion. Why is that so important?

Jeremy
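
For anyone following the thread, here is a minimal sketch of what the raw string seems to change in this case. In a plain string literal, Python turns \n into a real newline character before the re module ever sees the pattern, and under re.VERBOSE unescaped whitespace in the pattern is ignored, so that newline silently drops out. A raw literal passes the two characters backslash and n through, and re then treats them as "match a newline". The names sample, plain, raw, text and cleaned are made up for illustration, and the sample text assumes the comment lines have no leading whitespace.

```python
import re

# A plain literal holds one newline character; a raw literal holds
# the two characters backslash and n.
print(len("\n"), len(r"\n"))  # 1 2

sample = "c one\nnext"  # hypothetical two-line input

# Under re.VERBOSE the real newline in the plain literal is ignored,
# so the effective pattern is just c.* and the line break is left behind.
plain = re.sub("c.*\n", "", sample, flags=re.VERBOSE)

# With the raw literal, re receives \n as an escape and matches the newline.
raw = re.sub(r"c.*\n", "", sample, flags=re.VERBOSE)

print(repr(plain))  # '\nnext'
print(repr(raw))    # 'next'

# The pattern suggested in the quoted reply, applied to text like the example.
text = """c
C - Second full line comment (first comment had no text)
c Third full line comment
F44:N 2 $ Inline comments start with dollar sign and go to end of line"""

commentPattern = re.compile(r"""
    (^[ \t]*c.*\n|   # full-line comment starting with c or C
    [ \t]*\$.*)      # inline comment starting with $
    """, re.VERBOSE | re.MULTILINE | re.IGNORECASE)

cleaned = commentPattern.sub("", text)
print(repr(cleaned))  # 'F44:N 2'
```

Doubling the backslash in a plain literal ("c.*\\n") should behave the same as the raw form, which is the other option mentioned in the quoted reply.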