Chris Angelico wrote: > On Sun, Jul 17, 2011 at 5:47 PM, Xah Lee <xah...@gmail.com> wrote: >> the problem is to write a script that can check a dir of text files >> (and all subdirs) and reports if a file has any mismatched matching >> brackets. > > I wonder will it be possible to code the whole thing as a single > regular expression... I'm pretty sure it could be done as a sed or awk > script, but I'm insufficiently expert in either to do the job.
Did you notice the excessive crosspost? Please do not feed the troll. In the classical sense is not possible, as classical regular expressions have no concept of recursion. Indeed, matching brackets are a textbook example for a non-regular¹ context-free language L = {a^n b^n; n > 0} that can only be recognized by a pushdown automaton (PDA). (Finite automata "cannot count".) However, it is possible (and done) to use classical regular expressions or non-recursive Regular Expressions (note the different character case) to match tokens more efficiently with a PDA implementation. This is commonly called a parser. (Programming languages tend to be specified in terms of a context-free grammar – they tend to be context-free languages –, which is why a parser is a part of a compiler or interpreter. See for example Python.²) It is possible, with Perl-compatible Regular Expressions (PCRE), provided that you have enough memory, to use such an extended Regular Expression (not to be confused with EREs³)⁴: \((([^()]*|(?R))*)\) However, even Python 3.2 does not support those expressions (although it supports some other PCRE patterns, like named subexpressions)⁵, neither do standard and forked versions of sed(1) (BREs, EREs, using an NFA) nor awk (EREs, using a DFA or NFA). [That is not to say it would not be possible with Python, or sed or awk (both of which are off-topic here), but that more than a Regular Expression would be required.] On this subject, I recommend reading the preview chapters of the second and third editions, respectively, of Jeffrey E. F. Friedl's "Mastering Regular Expressions", which are available online for free at O'Reilly.com⁶. HTH. ______ ¹ because it can be proved that the pumping lemma for regular languages does not apply to it; see also <http://en.wikipedia.org/wiki/Chomsky_hierarchy> pp. ² <http://docs.python.org/reference/> ³ <http://en.wikipedia.org/wiki/Regular_expression> ⁴ Cf. <http://php.net/manual/en/regexp.reference.recursive.php> ⁵ <http://docs.python.org/library/re.html> ⁶ <http://oreilly.com/catalog/regex/chapter/ch04.html>, <http://oreilly.com/catalog/9780596528126/preview#preview> -- PointedEars Bitte keine Kopien per E-Mail. / Please do not Cc: me. -- http://mail.python.org/mailman/listinfo/python-list