On Jul 30, 2009, at 11:27 AM, Stefan Walk wrote:
>> and that there's nothing you can do with Oniguruma that you can't
>> also practically do with PCRE (to the best of my knowledge),
>
> http://www.geocities.jp/kosako3/oniguruma/doc/RE.txt - Paragraph 8,
> example 2 - specifying the nest level for subroutines/back
> references doesn't work with pcre. That's one example I know from
> the top of my head ...

Granted :). But it would be possible to rewrite that example into a
PCRE regexp, just a bit less efficiently. Subroutine calls *are* one
thing I like about Oniguruma, but not enough to counter all the other
arguments in favor of PCRE. Honestly, that sort of extension to the
syntax verges on kicking regexp out of the domain it was intended for;
subroutine calls come close to making regexp trivially Turing-complete.
If you need that level of matching, it's probably time to consider a
more aggressive string parsing method, such as tokenization.
Regexp is basically a VM (at least as most implementations handle it),
and on nontrivial patterns will often be slower than other methods.
Certainly that particular example, matching an XML tag, will never
deal with all the esoteric cases. You can approach perfect matching as
pattern complexity approaches infinity (i.e. correctness -> 100% as
complexity -> inf), but the pattern is only realistically useful in
specific cases where you can be guaranteed a valid input stream. And
supposing it doesn't match, there's no reasonable way to determine
*why*.
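
To illustrate the tokenization point, here is a rough sketch in Python (not from the Oniguruma or PCRE docs; the TAG pattern and the check_nesting name are made up for this example). It uses a deliberately simple regexp only to find tag tokens, then tracks nesting with an explicit stack, so a failed match can say where and why it failed. It assumes simplified markup and is nowhere near a real XML parser.

```python
import re

# Simplified tag tokenizer (an assumption for illustration): matches opening,
# closing, and self-closing tags. It ignores comments, CDATA, and attribute
# values that contain '<' or '>'.
TAG = re.compile(r"<(/?)([A-Za-z][\w.-]*)[^<>]*?(/?)>")


def check_nesting(text):
    """Return None if tags nest properly, else a message describing the failure."""
    stack = []  # (tag name, offset) for every tag that is currently open
    for m in TAG.finditer(text):
        closing, name, self_closing = m.group(1), m.group(2), m.group(3)
        if self_closing:
            continue  # <br/> opens and closes itself; nothing to track
        if not closing:
            stack.append((name, m.start()))
        elif not stack:
            return f"unexpected </{name}> at offset {m.start()}: nothing is open"
        else:
            open_name, open_at = stack.pop()
            if open_name != name:
                return (f"</{name}> at offset {m.start()} closes "
                        f"<{open_name}> opened at offset {open_at}")
    if stack:
        name, at = stack[-1]
        return f"<{name}> opened at offset {at} is never closed"
    return None


if __name__ == "__main__":
    print(check_nesting("<a><b>text</b></a>"))
    print(check_nesting("<a><b></a></b>"))
```

The regexp here stays trivial; the nesting logic lives in ordinary code, which is what makes the error message possible in the first place.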
Also, more to the point, Oniguruma hasn't been updated in two years
and counting, which is enough to count it as unmaintained. PCRE is
still seeing releases on a regular basis.
-- Gwynne
--
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: http://www.php.net/unsub.php