Paul McGuire wrote:
I'd be very interested to see if there actually is a benchmark suite for regexps. I imagine that this could be an easy area for quite a varied set of results, depending on the expression features included in the actual regexp being tested, and even the nature of the input text. For example, a simple re that just scans for words in a text stream may perform very differently from one that searches for delimited text, has to look ahead for greedy matches, maintains return groups, performs named substitutions, etc.
Without a pretty thorough benchmark suite, I would doubt that performance claims are much better than anecdotes.
we did fairly extensive benchmarks when moving from PRE to SRE back in the 1.6 days (partially based on benchmarks developed when moving from REGEX to PRE):
http://mail.python.org/pipermail/python-dev/2000-August/007797.html
(but as you can see, this benchmarking was done to make sure that the new engine didn't slow things down, not to see if SRE was slower or faster than the "competition". feel free to try the microbenchmarks with recent versions of Python and your favourite non-Python language...)
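
for anyone who wants a starting point, here is a minimal sketch of what such a microbenchmark might look like with the standard re and timeit modules. this is not the original PRE/SRE benchmark suite; the sample text, the pattern set, and the bench() helper are all hypothetical and only illustrate the point above that different regexp features should be timed separately:

```python
import re
import timeit

# Hypothetical sample input and patterns, chosen only for illustration;
# they are not taken from the PRE/SRE benchmarks linked above.
TEXT = 'spam "quoted eggs" ham=42; ' * 200

PATTERNS = {
    "words": r"\w+",                                # plain word scan
    "delimited": r'"[^"]*"',                        # quote-delimited text
    "lookahead": r"\w+(?==)",                       # word followed by "="
    "named_groups": r"(?P<key>\w+)=(?P<value>\d+)", # named groups
}


def bench(patterns, text, number=20):
    """Return {name: best seconds per findall() pass} for each pattern."""
    results = {}
    for name, source in patterns.items():
        compiled = re.compile(source)  # compile once, outside the timed call
        timer = timeit.Timer(lambda: compiled.findall(text))
        # Best of 3 repeats, each running `number` passes over the text.
        results[name] = min(timer.repeat(repeat=3, number=number)) / number
    return results


if __name__ == "__main__":
    for name, seconds in bench(PATTERNS, TEXT).items():
        print(f"{name:>14}: {seconds * 1e6:10.1f} usec per pass")
```

the absolute numbers depend heavily on the input text and the machine, so they are only meaningful when compared across engines or versions on the same input.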
</F>