In article <mailman.2062.1369400329.3114.python-l...@python.org>, Malte Forkel <malte.for...@berlin.de> wrote:
> Finding out why a regular expression does not match a given string can
> be very tedious. I would like to write a utility that identifies the
> sub-expression causing the non-match. My idea is to use a parser to
> create a tree representing the complete regular expression. Then I could
> simplify the expression by dropping sub-expressions one by one from
> right to left and from bottom to top until the remaining regex matches.
> The last sub-expression dropped should be (part of) the problem.
>
> As a first step, I am looking for a parser for Python regular
> expressions, or a Python regex grammar to create a parser from.
>
> But maybe my idea is flawed? Or a similar (or better) tool already
> exists? Any advice will be highly appreciated!

I think this would be a really cool tool. The debugging process I've
always used is essentially what you describe. I start by trying
progressively shorter sub-patterns until I get a match, then
incrementally add back little bits of the original pattern until it no
longer matches. With luck, the problem becomes obvious at that point.

Having a tool which automated this would be really useful.

Of course, most of the Python user community are wimps and shy away
from big hairy regexes [ducking and running].
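For what it's worth, here is a rough sketch of the "progressively shorter
sub-patterns" step, using nothing but the standard re module. It is not the
tree-based tool proposed above: it chops the pattern one character at a time
from the right instead of dropping parsed sub-expressions, and it skips any
truncation that no longer compiles. The function name
longest_matching_prefix is made up for this example. Cutting at character
boundaries can change what a pattern means (for instance "a|b" becomes "a|",
which matches the empty string), so treat the result as a hint, not a
diagnosis.

```python
import re


def longest_matching_prefix(pattern, text):
    """Return (prefix, dropped): the longest leading part of `pattern`
    that compiles and matches at the start of `text`, plus the rest.

    Hypothetical helper for illustration; truncates by character, not by
    parsed sub-expression.
    """
    for end in range(len(pattern), -1, -1):
        prefix = pattern[:end]
        try:
            compiled = re.compile(prefix)
        except re.error:
            # The cut left an incomplete construct (open group, open
            # character class, trailing backslash, ...), so skip it.
            continue
        if compiled.match(text):
            return prefix, pattern[end:]
    # Not reached: the empty pattern always compiles and matches.
    raise AssertionError("unreachable")


if __name__ == "__main__":
    prefix, dropped = longest_matching_prefix(
        r"(\d+)-(\d+)-([a-z]+)$", "2013-05-ABC")
    print("matches up to:", prefix)   # (\d+)-(\d+)-
    print("trouble starts in:", dropped)  # ([a-z]+)$
```

In that example the culprit is the lowercase-only character class meeting
"ABC". If the whole pattern already matches, the dropped part comes back as
an empty string.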