2012/8/18 Frank Koshti <frank.kos...@gmail.com>:
> Hey Steven,
>
> Thank you for the detailed (and well-written) tutorial on this very
> issue. I actually learned a few things! Though, I still have
> unresolved questions.
>
> The reason I don't want to use an XML parser is because the tokens are
> not always placed in HTML, and even in HTML, they may appear in
> strange places, such as <h1 $foo(x=3)>Hello</h1>. My specific issue is
> I need to match, process and replace $foo(x=3), knowing that (x=3) is
> optional, and the token might appear simply as $foo.
>
> To do this, I decided to use:
>
> re.compile('\$\w*\(?.*?\)').findall(mystring)
>
> the issue with this is it doesn't match $foo by itself, and requires
> there to be () at the end.
>
> Thanks,
> Frank
> --
> http://mail.python.org/mailman/listinfo/python-list
Hi,
I don't quite follow the pattern you are using with respect to the task you describe, but you should most likely write it as a raw string, i.e. r"..." instead of "...", or else double every backslash (\\w etc.) so that it reaches the re module intact.

I am probably misunderstanding the specification, as the following:

>>> re.sub(r"\$foo\(x=3\)", "bar", "<h1 $foo(x=3)>Hello</h1>")
'<h1 bar>Hello</h1>'
>>>

is probably not the desired output.

To "process" the matched text in some way, you can also pass a function instead of a replacement string as the second argument of re.sub; see
http://docs.python.org/library/re.html#re.sub

hth,
vbr
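
For what it's worth, here is a minimal sketch of the replacement-function approach mentioned above, with the parenthesised part made optional so that both $foo and $foo(x=3) match. The names TOKEN_RE, parse_args, replace_token and expand_tokens are made up for illustration, and so are the key=value argument format and the bracketed output; the real "processing" step would depend on what the tokens are supposed to do.

```python
import re

# "$" + a name, optionally followed by "(...)" with no nested parentheses.
# The trailing "?" on the non-capturing group is what makes "(x=3)" optional.
TOKEN_RE = re.compile(r"\$(\w+)(?:\(([^)]*)\))?")


def parse_args(arg_text):
    """Turn 'x=3, y=4' into {'x': '3', 'y': '4'}; None or '' gives {}."""
    args = {}
    if arg_text:
        for pair in arg_text.split(","):
            key, _, value = pair.partition("=")
            args[key.strip()] = value.strip()
    return args


def replace_token(match):
    """Called by re.sub for every token; returns the replacement text."""
    name = match.group(1)
    # group(2) is None when the token has no parentheses at all.
    args = parse_args(match.group(2))
    # Placeholder processing (an assumption): just render the parsed token.
    if args:
        rendered = ",".join(f"{k}={v}" for k, v in sorted(args.items()))
        return f"[{name}:{rendered}]"
    return f"[{name}]"


def expand_tokens(text):
    """Replace every $name or $name(...) token in text."""
    return TOKEN_RE.sub(replace_token, text)


print(expand_tokens("<h1 $foo(x=3)>Hello</h1> and $foo"))
```

Because the pattern uses [^)]* for the arguments, it stops at the first closing parenthesis, so nested parentheses inside a token would need a different approach.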