Finding out why a regular expression does not match a given string can be very tedious. I would like to write a utility that identifies the sub-expression causing the non-match. My idea is to use a parser to create a tree representing the complete regular expression. Then I could simplify the expression by dropping sub-expressions one by one, from right to left and from bottom to top, until the remaining regex matches. The last sub-expression dropped should be (part of) the problem.
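To make the idea concrete, here is a crude sketch that skips the parser entirely. It is an approximation, not the tree-based approach described above: it drops characters (not sub-expressions) from the right end of the pattern, skips truncations that no longer compile, and stops at the longest prefix that matches. The function name find_failing_suffix is made up for illustration, and I am assuming re.match semantics (anchored at the start of the string).

```python
import re


def find_failing_suffix(pattern, text):
    """Return (longest_matching_prefix, dropped_suffix) for a pattern.

    Crude approximation: truncates the pattern string character by
    character from the right instead of walking a parse tree.
    """
    if re.match(pattern, text):
        return pattern, ""  # nothing to diagnose

    # Try ever shorter prefixes. The empty pattern always matches,
    # so the loop is guaranteed to return by the time end reaches 0.
    for end in range(len(pattern) - 1, -1, -1):
        prefix = pattern[:end]
        try:
            compiled = re.compile(prefix)
        except re.error:
            # The cut went through a group, character class, or escape.
            continue
        if compiled.match(text):
            return prefix, pattern[end:]


# The dropped suffix starts at (or near) the part that fails.
print(find_failing_suffix(r"(\w+)@(\w+)\.com", "user@example.org"))
```

With a real parse tree the dropped pieces would be whole sub-expressions, so the result would not depend on where a character-level cut happens to produce a valid regex. This sketch also only looks at the right end, so it can be misleading when the actual problem sits in an earlier alternative or inside a nested group.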
As a first step, I am looking for a parser for Python regular expressions, or a Python regex grammar to create a parser from. But maybe my idea is flawed? Or does a similar (or better) tool already exist? Any advice will be highly appreciated!

Malte