Re: Why is regex so slow?

Dave Angel Tue, 18 Jun 2013 19:15:16 -0700

On 06/18/2013 09:51 PM, Steven D'Aprano wrote:

   <SNIP>


> Even if the regex engine is just as efficient at doing simple character
> matching as `in`, and it probably isn't, your regex tries to match all
> eleven characters of "ENQUEUEING" while the `in` test only has to match
> three, "ENQ".

The rest of your post was valid, and useful, but there's a misconception in this paragraph; I hope you don't mind me pointing it out.

In general, for simple substring searches, you can search for a large string faster than you can search for a smaller one. I'd expect


if "ENQUEUING" in bigbuffer

to be faster than

if "ENQ" in bigbuffer

assuming that all occurrences of ENQ will actually match the whole thing. If CPython's implementation doesn't show the speed difference, maybe there's some room for optimization.
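For anyone who wants to check this on their own machine, here is a rough timing sketch. The buffer contents, its size, and the helper name time_search are made up for illustration; the only match sits at the very end so both searches have to scan the whole buffer. Results will vary with the Python version and the data, so treat it as an experiment, not a proof.

```python
import timeit

# Hypothetical test data: filler text containing no "ENQ",
# followed by a single match at the very end of the buffer.
bigbuffer = "the quick brown fox jumps over the lazy dog " * 20000 + "ENQUEUING"


def time_search(needle, haystack, number=200):
    """Return the total seconds taken by `number` runs of `needle in haystack`."""
    return timeit.timeit(lambda: needle in haystack, number=number)


if __name__ == "__main__":
    for needle in ("ENQ", "ENQUEUING"):
        print(needle, time_search(needle, bigbuffer))
```

If the two timings come out about the same, that would be the "room for optimization" case mentioned above.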


See Boyer-Moore if you want a peek at the algorithm.
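To give a feel for why a longer needle can help, here is a minimal sketch of Horspool's simplified variant of Boyer-Moore. It is not the full Boyer-Moore algorithm, and I'm not claiming it is what CPython does; the function name horspool_find is just for illustration. The idea is to look at the last character of the current window and skip ahead by as much as the needle allows.

```python
def horspool_find(haystack, needle):
    """Return the index of the first occurrence of needle in haystack, or -1."""
    m, n = len(needle), len(haystack)
    if m == 0:
        return 0
    # For each character of the needle except the last, record how far the
    # window can shift when that character is aligned with the window's end.
    # Later (rightmost) occurrences overwrite earlier ones, giving the
    # smallest safe shift.
    shift = {ch: m - 1 - i for i, ch in enumerate(needle[:-1])}
    pos = 0
    while pos + m <= n:
        if haystack[pos:pos + m] == needle:
            return pos
        # Characters that never appear in the needle allow a full jump of m.
        pos += shift.get(haystack[pos + m - 1], m)
    return -1


if __name__ == "__main__":
    print(horspool_find("xxENQUEUINGxx", "ENQUEUING"))
```

When the buffer mostly contains characters that aren't in the needle, each mismatch lets the window jump by the full needle length: 9 positions for "ENQUEUING" versus 3 for "ENQ". That is where the speedup for longer strings comes from.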

When I was writing a simple search program, I could typically search for a 4-character string faster than REP SCASB could match a one-character string. And that's a single instruction (with prefix).

--
DaveA
--
http://mail.python.org/mailman/listinfo/python-list
