On 8/18/2015 10:25 AM, Neal Becker wrote:
> Trying regex 2015.07.19
>
> I'd like to match recursive parenthesized expressions, with groups such that
> '(a(b)c)'

Extended regular expressions can only match strings in extended regular languages. Arbitrarily nested expressions are too general for that; you need a context-free parser. You can find one on PyPI or write your own, which in this case is quite simple.
---
from xploro.test import ftest  # my personal function test function

io_pairs = (('abc', []), ('(a)', [(0, '(a)')]), ('a(b)c', [(1, '(b)')]),
            ('(a(b)c)', [(0, '(a(b)c)'), (2, '(b)')]),
            ('a(b(cd(e))(f))g', [(1, '(b(cd(e))(f))'), (3, '(cd(e))'),
                                 (6, '(e)'), (10, '(f)')]),)

def parens(text):
    '''Return sorted list of paren tuples for text.

    Paren tuple is start index (for sorting) and substring.
    '''
    opens = []
    parens = set()
    for i, char in enumerate(text):
        if char == '(':
            opens.append(i)
        elif char == ')':
            start = opens.pop()
            parens.add((start, text[start:(i+1)]))
    return sorted(parens)

ftest(parens, io_pairs)
---
all pass
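
Since ftest lives in my personal package, here is a sketch that runs with nothing but the standard library. Two things in it are assumptions and not part of the code above: check_pairs is a hypothetical stand-in for ftest (I am assuming all it needs to do is compare each output with the expected value), and parens_strict is a variant of parens() that raises ValueError on unbalanced input instead of failing with IndexError on a stray ')'.

```python
def parens_strict(text):
    """Like parens(), but raise ValueError on unbalanced parentheses."""
    opens = []   # stack of indices of '(' not yet closed
    found = []   # (start index, substring) for each matched pair
    for i, char in enumerate(text):
        if char == '(':
            opens.append(i)
        elif char == ')':
            if not opens:
                raise ValueError(f"unmatched ')' at index {i}")
            start = opens.pop()
            found.append((start, text[start:i + 1]))
    if opens:
        raise ValueError(f"unmatched '(' at index {opens[-1]}")
    return sorted(found)


def check_pairs(func, pairs):
    """Hypothetical stand-in for ftest: compare func(arg) with expected."""
    for arg, expected in pairs:
        result = func(arg)
        assert result == expected, (arg, result, expected)
    return 'all pass'


io_pairs = (('abc', []), ('(a)', [(0, '(a)')]), ('a(b)c', [(1, '(b)')]),
            ('(a(b)c)', [(0, '(a(b)c)'), (2, '(b)')]),
            ('a(b(cd(e))(f))g', [(1, '(b(cd(e))(f))'), (3, '(cd(e))'),
                                 (6, '(e)'), (10, '(f)')]),)

print(check_pairs(parens_strict, io_pairs))
```

Sorting by start index puts the outermost pair first, so for '(a(b)c)' the substrings come out in the same order as the group(0), group(1) numbering asked for.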


> would give
> group(0) -> '(a(b)c)'
> group(1) -> '(b)'
>
> but that's not what I get
>
> import regex
>
> #r = r'\((?>[^()]|(?R))*\)'
> r = r'\(([^()]|(?R))*\)'
> #r = r'\((?:[^()]|(?R))*\)'
> m = regex.match (r, '(a(b)c)')
>
>   m.groups()
> Out[28]: ('c',)
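
As for why groups() shows only ('c',): as far as I can tell, this is the usual behavior of a capture group inside a repetition, which reports only its last iteration. A small sketch with the stdlib re module follows. It has no recursion, so a flat string stands in for the nested one, and treating the regex result as the same effect is my assumption.

```python
import re

# Group 1 sits inside a '*' repetition, so each iteration overwrites
# what the previous one captured; only the final character survives.
m = re.match(r'\(([^()])*\)', '(abc)')

print(m.group(0))   # the whole match
print(m.groups())   # group 1 holds only its last iteration
```

Here group(0) is '(abc)' while groups() is ('c',), the same shape as the output quoted above.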



--
Terry Jan Reedy

--
https://mail.python.org/mailman/listinfo/python-list
