Re: Generating text from a regular expression

Gabriel Genellina Wed, 31 Mar 2010 05:01:10 -0700

En Wed, 31 Mar 2010 07:49:14 -0300, Nathan Harmston<iwanttobeabad...@googlemail.com> escribió:

I have a slightly complicated/medium sized regular expression and I
want to generate all possible words that it can match (to compare
performance of regex against an acora based matcher). Using the
regular expression as a grammar to generate all words in its language.
I was wondering if this possible in Python or possible using anything.
Google doesnt seem to give any obvious answers.


I've done this some time ago.

This recipehttp://code.activestate.com/recipes/577041-merge-multiple-potentially-infinite-sorted-inputs-/provides an infinite merge operation, required for effectively enumeratinga regular language (with shorter strings first, else 'a*b' would get stuckrepeating "a"s and never yielding any "b").The example at the end shows the basics of how to use it (by hand) in avery simple case.To enumerate all strings matching '(a|bc)*' one should invokeclosure(product(['a'],['b','c'])), which gives:


'', 'a', 'aa', 'bc',
'aaa', 'abc', 'bca',
'aaaa', 'aabc', 'abca', 'bcaa', 'bcbc',
'aaaaa', 'aaabc', 'aabca', 'abcaa', 'abcbc', 'bcaaa', 'bcabc', 'bcbca',
...

Converting the former expression into the later function calls requires aparser (not shown -- and I won't be able to find it until Friday)


--
Gabriel Genellina

--
http://mail.python.org/mailman/listinfo/python-list

Re: Generating text from a regular expression

Reply via email to