On Aug 18, 12:22 pm, Jussi Piitulainen <jpiit...@ling.helsinki.fi> wrote:
> Frank Koshti writes:
> > not always placed in HTML, and even in HTML, they may appear in
> > strange places, such as <h1 $foo(x=3)>Hello</h1>. My specific issue
> > is I need to match, process and replace $foo(x=3), knowing that
> > (x=3) is optional, and the token might appear simply as $foo.
> >
> > To do this, I decided to use:
> >
> > re.compile('\$\w*\(?.*?\)').findall(mystring)
> >
> > the issue with this is it doesn't match $foo by itself, and requires
> > there to be () at the end.
>
> Adding a ? after the meant-to-be-optional expression would let the
> regex engine know what you want. You can also separate the mandatory
> and the optional part in the regex to receive pairs as matches. The
> test program below prints this:
>
> >$foo()$foo(bar=3)$$$foo($)$foo($bar(v=0))etc</htm
>
> ('$foo', '')
> ('$foo', '(bar=3)')
> ('$foo', '($)')
> ('$foo', '')
> ('$bar', '(v=0)')
>
> Here is the program:
>
> import re
>
> def grab(text):
>     p = re.compile(r'([$]\w+)([(][^()]+[)])?')
>     return re.findall(p, text)
>
> def test(html):
>     print(html)
>     for hit in grab(html):
>         print(hit)
>
> if __name__ == '__main__':
>     test('>$foo()$foo(bar=3)$$$foo($)$foo($bar(v=0))etc</htm')

You read my mind. I didn't even know that's possible. Thank you-
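
For anyone reading along, here is a rough sketch of how the "process and replace" half might look with the same pattern. It is only one possible approach, not something from the quoted message: the function name expand and the handlers dict (token name mapped to a callable that receives the argument text) are made up for illustration. It relies on re.sub accepting a function as the replacement.

```python
import re

# Same pattern as in the quoted program: a $name token, optionally
# followed by a parenthesised argument list with no nested parentheses.
TOKEN = re.compile(r'([$]\w+)([(][^()]+[)])?')

def expand(text, handlers):
    """Replace every token whose name appears in `handlers`.

    `handlers` is a hypothetical dict mapping a token name such as
    '$foo' to a callable that takes the argument text ('' when the
    token has no arguments) and returns the replacement string.
    Tokens with no handler are left untouched.
    """
    def substitute(match):
        name = match.group(1)
        args = match.group(2) or ''   # group 2 is None when absent
        handler = handlers.get(name)
        if handler is None:
            return match.group(0)     # unknown token: keep as written
        return handler(args)
    return TOKEN.sub(substitute, text)

if __name__ == '__main__':
    handlers = {'$foo': lambda args: 'FOO' + args}
    print(expand('<h1 $foo(x=3)>Hello</h1> and $foo and $bar(y=1)', handlers))
```

One thing to watch: because the optional group needs at least one character between the parentheses, a token written as $foo() is matched as plain $foo, and the empty () is left behind in the output. Whether that matters depends on how the tokens are written in practice.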