On 04/01/2011 17:11, Jeremy wrote:
I am trying to write a regular expression that finds and deletes (replaces with
nothing) comments in a string/file.  A line is a comment if its first
non-whitespace character is a 'c'; a dollar sign anywhere in a line starts an
inline comment that runs to the end of that line.  Replacing these comments
with nothing isn't too hard.  The trouble is that each removed comment leaves a
new-line behind; that is, the new-line isn't captured by the regular expression.

Below, I have copied a minimal example.  Can someone help?

Thanks,
Jeremy


import re

text = """ c
C - Second full line comment (first comment had no text)
c   Third full line comment
   F44:N 2    $ Inline comments start with dollar sign and go to end of line"""

commentPattern = re.compile("""
     (^\s*?c\s*?.*?|             # Comment start with c or C
     \$.*?)$\n                           # Comment starting with $
     """, re.VERBOSE|re.MULTILINE|re.IGNORECASE)

Part of the problem is that you're not using raw string literals or
doubling the backslashes.
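
To see why that matters here, a minimal sketch (the tiny patterns and the
sample string are made up for illustration, not taken from your code): in a
non-raw literal Python turns \n into a real newline before the re module sees
it, and re.VERBOSE then discards that newline as insignificant whitespace, so
the pattern never consumes the line ending.

```python
import re

# Non-raw literal: \n is already a real newline character when re.compile
# receives the string, and re.VERBOSE ignores it like any other whitespace.
non_raw = re.compile("""
    x$\n
    """, re.VERBOSE | re.MULTILINE)

# Raw literal: re receives a backslash followed by 'n' and treats it as an
# escape that matches a newline.
raw = re.compile(r"""
    x$\n
    """, re.VERBOSE | re.MULTILINE)

sample = "x\ny"
print(repr(non_raw.sub("", sample)))  # '\ny' - the newline is left behind
print(repr(raw.sub("", sample)))      # 'y'   - the newline is removed too
```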

Try something like this:

commentPattern = re.compile(r"""
    (^[ \t]*c.*\n|              # Comment start with c or C
    [ \t]*\$.*)                 # Comment starting with $
    """, re.VERBOSE|re.MULTILINE|re.IGNORECASE)

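# finditer() returns an iterator, which is always true in a boolean test;
# build a list so that the "if found:" check below is meaningful.
found = list(commentPattern.finditer(text))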

print("\n\nCard:\n--------------\n%s\n------------------" %text)

if found:
    print("\nI found the following:")
    for f in found: print(f.groups())

else:
    print("\nNot Found")

print("\n\nComments replaced with ''")
replaced = commentPattern.sub('', text)
print("--------------\n%s\n------------------" %replaced)
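
For a quick self-contained check, here is a sketch that applies the suggested
pattern to the sample text. The names COMMENT and strip_comments are
hypothetical, chosen only for this illustration.

```python
import re

COMMENT = re.compile(r"""
    (^[ \t]*c.*\n|      # full-line comment: first non-blank character is c or C
    [ \t]*\$.*)         # inline comment: from $ to the end of the line
    """, re.VERBOSE | re.MULTILINE | re.IGNORECASE)


def strip_comments(text):
    """Return text with full-line and inline comments removed."""
    return COMMENT.sub("", text)


text = """ c
C - Second full line comment (first comment had no text)
c   Third full line comment
   F44:N 2    $ Inline comments start with dollar sign and go to end of line"""

print(repr(strip_comments(text)))  # '   F44:N 2'
```

Note that the first alternative requires a trailing \n, so a full-line comment
on the very last line of the text, with no new-line after it, would not be
removed by this pattern.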


--
http://mail.python.org/mailman/listinfo/python-list
