Matthew Barnett <pyt...@mrabarnett.plus.com> added the comment:

The regex module supports nested sets and set operations, e.g. 
r"[[a-z]--[aeiou]]" (the letters from 'a' to 'z', except the vowels). This 
means that a literal '[' in a set needs to be escaped.
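
For comparison, the standard re module has no set operations, but a set
difference like the one above can be approximated there by putting a negative
lookahead in front of a plain set. This is only a sketch of an equivalent, not
how the regex module implements it; the name `consonant` is made up for
illustration.

```python
import re

# Stdlib approximation of r"[[a-z]--[aeiou]]": the lookahead rejects
# vowels, then [a-z] consumes any remaining lowercase letter.
consonant = re.compile(r"(?![aeiou])[a-z]")

found = consonant.findall("regex module")
print(found)
```

The lookahead is zero-width, so each match is still a single character, as it
would be with a real set difference.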

For example, the re module sees "[][()]..." as:

    [      start of set
     ]     literal ']'
     [()   literals '[', '(', ')'
    ]      end of set
    ...   ...

but the regex module sees it as:

    [      start of set
     ]     literal ']'
     [()]  nested set [()]
     ...   ...

Thus:

>>> s = u'void foo ( type arg1 [, type arg2 ] )'
>>> regex.sub(r'(?<=[][()]) |(?!,) (?!\[,)(?=[][(),])', '', s)
u'void foo ( type arg1 [, type arg2 ] )'
>>> regex.sub(r'(?<=[]\[()]) |(?!,) (?!\[,)(?=[]\[(),])', '', s)
u'void foo(type arg1 [, type arg2])'

If the regex module can't parse a set as a nested set, it tries again as a 
non-nested set (like re), but there are bound to be regexes that parse validly 
either way, and those will be read as nested sets.
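
One way to sidestep the ambiguity is to always escape a literal '[' inside a
set. The sketch below uses only the standard re module, to check that the
escaped spelling means the same thing to re as the unescaped one; the variable
names are made up for illustration, and the regex module itself is not
exercised here.

```python
import re

s = 'void foo ( type arg1 [, type arg2 ] )'

# Unescaped '[' inside the sets: re reads it as a literal character.
unescaped = r'(?<=[][()]) |(?!,) (?!\[,)(?=[][(),])'
# Escaped '[' inside the sets: same meaning to re, and per the comment
# above this is the spelling the regex module needs for a literal '['.
escaped = r'(?<=[]\[()]) |(?!,) (?!\[,)(?=[]\[(),])'

result_unescaped = re.sub(unescaped, '', s)
result_escaped = re.sub(escaped, '', s)
print(result_escaped)
```

Under re both patterns remove the same four spaces, so the escaped form can be
used where a pattern has to work with either module.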

----------

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue2636>
_______________________________________