Frank Koshti writes:

> not always placed in HTML, and even in HTML, they may appear in
> strange places, such as <h1 $foo(x=3)>Hello</h1>. My specific issue
> is I need to match, process and replace $foo(x=3), knowing that
> (x=3) is optional, and the token might appear simply as $foo.
>
> To do this, I decided to use:
>
> re.compile('\$\w*\(?.*?\)').findall(mystring)
>
> the issue with this is it doesn't match $foo by itself, and requires
> there to be () at the end.

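To make the symptom concrete, here is a small sketch that runs the quoted pattern against a few strings. The sample strings are my own, and the pattern is written as a raw string only to avoid escape warnings; it is otherwise unchanged. The closing \) is mandatory, so a bare $foo either fails to match or, thanks to the lazy .*?, runs on to the next closing parenthesis it can find.

```python
import re

# The pattern from the question, unchanged apart from the raw-string prefix.
original = re.compile(r'\$\w*\(?.*?\)')

# Sample inputs (made up for illustration).
with_args = original.findall('$foo(x=3)')
bare = original.findall('just $foo here')
overrun = original.findall('a $foo b $bar(x=3) c')

print(with_args)  # ['$foo(x=3)']
print(bare)       # [] because there is no ')' to satisfy the trailing \)
print(overrun)    # ['$foo b $bar(x=3)'] because .*? stretches to the next ')'
```
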
Adding a ? after the meant-to-be-optional expression would let the
regex engine know what you want. You can also separate the mandatory
and the optional part in the regex to receive pairs as matches. The
test program below prints this:

    >$foo()$foo(bar=3)$$$foo($)$foo($bar(v=0))etc</htm
    ('$foo', '')
    ('$foo', '(bar=3)')
    ('$foo', '($)')
    ('$foo', '')
    ('$bar', '(v=0)')

Here is the program:

    import re

    def grab(text):
        p = re.compile(r'([$]\w+)([(][^()]+[)])?')
        return re.findall(p, text)

    def test(html):
        print(html)
        for hit in grab(html):
            print(hit)

    if __name__ == '__main__':
        test('>$foo()$foo(bar=3)$$$foo($)$foo($bar(v=0))etc</htm')
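
Since the question also asks about processing and replacing the tokens, the same pattern can be handed to re.sub with a function as the replacement. What follows is only a sketch: the VALUES table, the expand function and the bracketed output format are hypothetical, invented for illustration. One detail worth knowing is that a match object reports an unmatched optional group as None, whereas findall reports it as an empty string.

```python
import re

# Same pattern as in the program above: group 1 is the token name,
# group 2 is the optional parenthesised part.
TOKEN = re.compile(r'([$]\w+)([(][^()]+[)])?')

# Hypothetical lookup table; a real application would supply its own.
VALUES = {'$foo': 'FOO'}

def expand(match):
    """Return the replacement text for one matched token."""
    name = match.group(1)
    args = match.group(2)  # None when the parentheses are absent
    if name not in VALUES:
        return match.group(0)  # leave unknown tokens untouched
    if args is None:
        return VALUES[name]
    # Drop the surrounding parentheses and show the arguments in brackets.
    return '%s[%s]' % (VALUES[name], args[1:-1])

def replace_tokens(text):
    return TOKEN.sub(expand, text)

if __name__ == '__main__':
    print(replace_tokens('<h1 $foo(x=3)>Hello $foo</h1>'))
    # <h1 FOO[x=3]>Hello FOO</h1>
```

Because expand receives the whole match object, any per-token processing can go there without touching the pattern itself.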