> From: Zdenek Maxa <[EMAIL PROTECTED]>
> Subject: Re: multiline regular expression (replace)
> Date: 29.5.2007 13:46:32
> ----------------------------------------
> [EMAIL PROTECTED] wrote:
> > On May 29, 2:03 am, Zdenek Maxa <[EMAIL PROTECTED]> wrote:
> >
> >> Hi all,
> >>
> >> I would like to perform regular expression replace (e.g. removing
> >> everything from within tags in a XML file) with multiple-line pattern.
> >> How can I do this?
> >>
> >> where = open("filename").read()
> >> multilinePattern = "^<tag> .... <\/tag>$"
> >> re.search(multilinePattern, where, re.MULTILINE)
> >>
> >> Thanks greatly,
> >> Zdenek
> >>
> >
> > Why not use an xml package for working with xml files? I'm sure
> > they'll handle your multiline tags.
> >
> > http://effbot.org/zone/element-index.htm
> > http://codespeak.net/lxml/
> >
> > ~Sean
> >
>
> Hi,
>
> that was merely an example of what I would like to achieve. However, in
> general, is there a way for handling multiline regular expressions in
> Python, using presumably only modules from distribution like re?
>
> Thanks,
> Zdenek
> --
> http://mail.python.org/mailman/listinfo/python-list
>

There shouldn't be any problem matching multiline strings with re, even without flags. The problem is more likely in the search pattern itself, especially the "..." part :-) If you are in fact using dots there, note that by default "." does not match a newline, so such a pattern cannot span several lines.
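
To make the point about the dots concrete, here is a minimal sketch (assuming Python 3; the sample text is made up, and "tag" is only the placeholder name from the question). It shows that "." stops at a newline by default, while a pattern that allows newlines by itself matches across lines with no flags at all:

```python
import re

# Made-up sample text, only for illustration.
text = "<tag>\nfirst line\nsecond line\n</tag>\n"

# "." does not match "\n" by default, so ".*" cannot cross a line break.
no_match = re.search(r"<tag>.*</tag>", text)
print(no_match)  # None

# No flags needed when the pattern itself permits newlines, e.g. the
# character class [^<] (any character except "<", newlines included).
match = re.search(r"<tag>[^<]*</tag>", text)
print(repr(match.group(0)))
```

The [^<] class is just one convenient choice here; it would stop too early if the tag content itself contained a "<".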
The re.M flag only changes the behaviour of the ^ and $ metacharacters, cf. the docs:

re.M
MULTILINE
    When specified, the pattern character "^" matches at the beginning of the string and at the beginning of each line (immediately following each newline); and the pattern character "$" matches at the end of the string and at the end of each line (immediately preceding each newline). By default, "^" matches only at the beginning of the string, and "$" only at the end of the string and immediately before the newline (if any) at the end of the string.

You may also want to check the S flag:

re.S
DOTALL
    Make the "." special character match any character at all, including a newline; without this flag, "." will match anything except a newline.

See:
http://docs.python.org/lib/node46.html
http://docs.python.org/lib/re-syntax.html

Vlasta
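
P.S. A short sketch of the two flags described above, in case it helps (assuming Python 3; the sample text and variable names are made up for illustration). re.S lets ".*?" span several lines, which is what the replace from the original question needs, while re.M only affects where "^" and "$" may match:

```python
import re

# Made-up sample text, only for illustration.
text = "header\n<tag>\nfirst line\nsecond line\n</tag>\nfooter\n"

# re.S (DOTALL): "." also matches "\n", so ".*?" can span lines.
# The "?" makes the match non-greedy: it stops at the first "</tag>".
tag_pattern = re.compile(r"<tag>.*?</tag>", re.S)
emptied = tag_pattern.sub("<tag></tag>", text)
print(repr(emptied))

# re.M (MULTILINE): "^" matches at the start of every line,
# not only at the start of the whole string.
line_starts = re.findall(r"^\w+", text, re.M)
only_first = re.findall(r"^\w+", text)
print(line_starts, only_first)
```

Both flags can be combined as re.M | re.S if a pattern needs line anchors and a dot that crosses newlines at the same time.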