On Mon, 13 Jul 2009 10:11:09 -0300, denis <denis-bz...@t-online.de> wrote:
> Matt, how many words are you looking for, and in how long a string?
> Were you able to time any(substr in long_string for substr in list_items)
> against re.compile("|".join(list_items))?
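For reference, the two approaches being compared could be written roughly as below. This is only a sketch: list_items and long_string are the names used in the question, the function names are made up for illustration, and re.escape is added on the assumption that the items are literal strings rather than patterns (and that the list is not empty).

```python
import re


def any_substring(list_items, long_string):
    # Generator expression: stops at the first item found,
    # but only reports True/False, not which item matched.
    return any(substr in long_string for substr in list_items)


def first_regex_match(list_items, long_string):
    # One alternation pattern built from all items; re.escape keeps
    # characters such as "." or "*" from being read as regex syntax.
    pattern = re.compile("|".join(re.escape(item) for item in list_items))
    match = pattern.search(long_string)
    # Return the matched item, or None when nothing matched.
    return match.group(0) if match else None
```

Wrapping each call in timeit.timeit(lambda: ..., number=1000) on realistic data would give a rough comparison; the result will likely depend a lot on the number of words and the length of the string.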
There is a known algorithm that solves exactly this problem: Aho-Corasick.
A good implementation should perform better than a regular expression, and
better than the generator expression, with the added advantage of returning
WHICH string matched.
There is a C extension somewhere that implements Aho-Corasick.
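To give an idea of what the algorithm does, here is a minimal pure-Python sketch. It is not the C extension mentioned above and will be much slower than one; the names build_automaton and find_all are hypothetical, and empty words are simply skipped.

```python
from collections import deque


def build_automaton(words):
    """Build a trie with failure links (goto/fail/out tables) for the words."""
    goto = [{}]   # goto[state][char] -> next state
    fail = [0]    # fail[state] -> state for the longest proper suffix
    out = [[]]    # out[state] -> words that end at this state
    for word in words:
        if not word:
            continue  # assumption: empty words are ignored
        state = 0
        for ch in word:
            nxt = goto[state].get(ch)
            if nxt is None:
                nxt = len(goto)
                goto[state][ch] = nxt
                goto.append({})
                fail.append(0)
                out.append([])
            state = nxt
        out[state].append(word)
    # Breadth-first pass: depth-1 states keep fail = 0, deeper states
    # inherit the failure link and the outputs of their suffix state.
    queue = deque(goto[0].values())
    while queue:
        state = queue.popleft()
        for ch, nxt in goto[state].items():
            queue.append(nxt)
            f = fail[state]
            while f and ch not in goto[f]:
                f = fail[f]
            fail[nxt] = goto[f].get(ch, 0)
            out[nxt] = out[nxt] + out[fail[nxt]]
    return goto, fail, out


def find_all(text, automaton):
    """Yield (start_index, word) for every occurrence, in one pass over text."""
    goto, fail, out = automaton
    state = 0
    for i, ch in enumerate(text):
        while state and ch not in goto[state]:
            state = fail[state]
        state = goto[state].get(ch, 0)
        for word in out[state]:
            yield i - len(word) + 1, word
```

For example, list(find_all("ushers", build_automaton(["he", "she", "his", "hers"]))) gives [(1, 'she'), (2, 'he'), (2, 'hers')], and next(find_all(...), None) stops at the first match, much like any() does, while still telling you which word was found.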
--
Gabriel Genellina