On Jul 30, 2009, at 11:27 AM, Stefan Walk wrote:
and that there's nothing you can do with Oniguruma that you can't also practically do with PCRE (to the best of my knowledge),
http://www.geocities.jp/kosako3/oniguruma/doc/RE.txt - Paragraph 8, example 2 - specifying the nest level for subroutines/back references doesn't work with pcre. That's one example I know from the top of my head ...


Granted :). But it would be possible to rewrite that example as a PCRE regexp, just a bit less efficiently. Subroutine calls *are* one thing I like about Oniguruma, but not enough to counter all the other arguments in favor of PCRE. Honestly, that sort of syntax extension verges on kicking regexp out of the domain it was intended for; subroutine calls come close to making regexp trivially Turing-complete. If you need that level of matching, it's probably time to consider a more aggressive string parsing method, such as tokenization. Regexp is basically a VM (at least as most implementations handle it), and on nontrivial patterns it will often be slower than other methods.
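
As a rough illustration of the tokenization approach mentioned above, here is a minimal sketch in Python. The TOKEN pattern and the tokenize() helper are hypothetical names made up for this example, and the sketch assumes a toy tag syntax with no attributes, comments, CDATA, or self-closing tags. The regexp only recognizes one token at a time; nesting is left to ordinary code that walks the token list.

```python
import re

# Hypothetical token pattern: an opening tag, a closing tag, or a run of text.
# Attributes, comments, CDATA and self-closing tags are deliberately ignored.
TOKEN = re.compile(r"<(?P<close>/?)(?P<name>[A-Za-z][\w-]*)>|(?P<text>[^<]+)")

def tokenize(source):
    """Split source into ("open" | "close" | "text", value) tuples."""
    tokens = []
    pos = 0
    while pos < len(source):
        m = TOKEN.match(source, pos)
        if m is None:
            # A stray "<" that does not start a well-formed tag.
            raise ValueError(f"unexpected character at offset {pos}")
        if m.group("text") is not None:
            tokens.append(("text", m.group("text")))
        elif m.group("close"):
            tokens.append(("close", m.group("name")))
        else:
            tokens.append(("open", m.group("name")))
        pos = m.end()
    return tokens
```

Calling tokenize("<a>hi</a>") yields an "open" token, a "text" token, and a "close" token, which a small hand-written parser can then consume without any regexp-level recursion.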

Certainly that particular example, matching an XML tag, will never deal with all the esoteric cases. You can approach perfect matching as pattern complexity approaches infinity (i.e. correctness -> 100% as complexity -> inf), but the pattern is only realistically useful in specific cases where you can be guaranteed a valid input stream. And supposing it doesn't match, there's no reasonable way to determine *why*.
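
To make the "why" point concrete, here is a hedged sketch in Python of a stack-based nesting check that reports where and how the input went wrong, which a single match/no-match regexp cannot do. The TAG pattern and check_nesting() are hypothetical names for this example, and the same toy tag syntax is assumed (bare names only, no attributes or self-closing tags).

```python
import re

# Hypothetical tag pattern: group 1 is "/" for a closing tag, group 2 is the name.
TAG = re.compile(r"<(/?)([A-Za-z][\w-]*)>")

def check_nesting(source):
    """Return None if tags are balanced, else a message saying what went wrong."""
    stack = []  # (name, offset) of the tags that are currently open
    for m in TAG.finditer(source):
        closing, name = m.group(1), m.group(2)
        if not closing:
            stack.append((name, m.start()))
        elif not stack:
            return f"offset {m.start()}: </{name}> has no opening tag"
        else:
            open_name, open_at = stack.pop()
            if open_name != name:
                return (f"offset {m.start()}: expected </{open_name}> "
                        f"(opened at offset {open_at}), found </{name}>")
    if stack:
        name, at = stack[-1]
        return f"offset {at}: <{name}> is never closed"
    return None
```

For the input "<a><b></a></b>" this returns "offset 6: expected </b> (opened at offset 3), found </a>", whereas a failed regexp match would only say that the pattern did not match.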

Also, more to the point, Oniguruma hasn't been updated in two years and counting, which is enough to count it as unmaintained. PCRE is still seeing releases on a regular basis.

-- Gwynne


--
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: http://www.php.net/unsub.php
