[Licheng Fang] >> Oh, please do have a look at the second link I've posted. There's a >> table comparing the regexp engines. The engines you've tested probably >> all use an NFA implementation.
[Bryan Olson] > Unfortunately, the stuff about NFA's is wrong. Friedl's awful > book Strongly disagree: it's an excellent book about the /pragmatics/ of using "regular expressions" as most widely implemented. It's not at all about "regular expressions" in the CompSci sense of the term, which appears to be your complaint. > was the first time I saw this confusion about what NFA is; > I don't know if he originated the mess or just propagated it. As far as I could tell at the time, he originated it. I'm not happy about that either. > "Nondeterministic finite automata" is well defined in theory > of computation. The set of languages accepted by NFA's is > exactly the same as the set accepted by DFA's. And which has very little to do with "regular expressions" as most widely implemented -- gimmicks like backreferences are wholly outside the DFA formalism. > What Python uses is search-and-backtrack. Unfortunately such > engines don't have much theory behind them, and it's hard to > reason generally about what they do. Yup X 3, and the last is precisely why Friedl's book is valuable for people using "NFA" implementations: Friedl does a good job of explaining when and why you're likely to trigger atrocious runtime performance, and gives practical general techniques for avoiding those problems. None of that has anything to do with CompSci regexps either, but is vital knowledge for people hoping to make happy non-trivial use of Python/Perl/etc regexps. -- http://mail.python.org/mailman/listinfo/python-list