"jwaixs" <[EMAIL PROTECTED]> wrote: > Thank you for your replies, it's much obvious now. I know more what I > can and can't do with the re module. But is it possible to search for > more than one string in the same line? > > bv. I want to replace the <python> with " " > </python> with "\n" and every thing that's not between the two python > tags must begin with "\nprint \"\"\"" and end with "\"\"\"\n"? Or do I > need more than one call?
You can do it in one call, but it's ugly; as others have already told
you, use HTMLParser or some other parsing package. Now if you insist...

    import re

    regex = re.compile(r'''(?:
        (?:<python>)
        (.*?)           # group 1: inside tags
        (?:</python>)
        )
        |               # OR
        ([^<]+)         # group 2: outside tags (+ so it never matches empty)
        ''', re.DOTALL | re.VERBOSE)

    def replace(match):
        g1, g2 = match.groups()
        if g1 is not None:
            # Inside <python>...</python>: emit the code unchanged.
            return g1
        # Outside the tags: wrap the text in a print statement.
        return '\nprint """%s"""\n' % g2

    text = '''this is <python>a stupid sentence</python> but still I <python>have to</python> write it.'''

    print(regex.sub(replace, text))

===== Output ==================

print """this is """
a stupid sentence
print """ but still I """
have to
print """ write it."""

=======================

George

--
http://mail.python.org/mailman/listinfo/python-list
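
For comparison, here is a minimal sketch of the HTMLParser route recommended above. It is not from the original post: the class name TemplateConverter and the function convert are made up for illustration, it assumes Python 3 (where the module is html.parser), and it simply drops any tag other than <python>.

```python
from html.parser import HTMLParser  # Python 3 location of HTMLParser


class TemplateConverter(HTMLParser):
    """Hypothetical helper: collects the converted chunks in self.parts."""

    def __init__(self):
        super().__init__()
        self.parts = []
        self.in_python = False  # True while inside <python>...</python>

    def handle_starttag(self, tag, attrs):
        if tag == "python":
            self.in_python = True

    def handle_endtag(self, tag):
        if tag == "python":
            self.in_python = False

    def handle_data(self, data):
        if self.in_python:
            # Code between the tags is copied unchanged.
            self.parts.append(data)
        else:
            # Text outside the tags is wrapped in a print statement.
            self.parts.append('\nprint """%s"""\n' % data)


def convert(text):
    parser = TemplateConverter()
    parser.feed(text)
    parser.close()  # flush any text still buffered by the parser
    return "".join(parser.parts)


text = ('this is <python>a stupid sentence</python> but still I '
        '<python>have to</python> write it.')
print(convert(text))
```

For the sample text this should give the same lines as the regex version. The state lives in one flag instead of two regex groups, which is easier to extend if more tags are needed later.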