On Jul 10, 12:53 pm, Nobody <nob...@nowhere.com> wrote:
> On Thu, 09 Jul 2009 18:36:05 -0700, inkhorn wrote:
>
> > For one of my projects, I came across the need to check if one of many
> > items from a list of strings could be found in a long string.
>
> If you need to match many strings or very long strings against the same
> list of items, the following should (theoretically) be optimal:
>
>         r = re.compile('|'.join(map(re.escape,list_items)))
>         ...
>         result = r.search(string)

"Theoretically optimal" holds only if the search mechanism builds a DFA or
something similar out of the list of strings. AFAIK Python's re module
doesn't: it is a backtracking matcher, so an alternation of many literal
strings is tried one alternative at a time at each position in the text.

Try this Aho-Corasick module instead:

http://hkn.eecs.berkeley.edu/~dyoo/python/ahocorasick/
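
For anyone who wants to see what the automaton-based approach looks like
without installing anything, here is a minimal pure-Python sketch of
Aho-Corasick. It is only an illustration, not the API of the module linked
above: the names build_automaton, find_all and contains_any are made up for
this example, and it assumes all search words are non-empty strings. It
scans the text once, regardless of how many words are in the list.

```python
from collections import deque


def build_automaton(words):
    """Build a trie with failure links and output lists (Aho-Corasick)."""
    goto = [{}]   # goto[state][char] -> next state
    fail = [0]    # fail[state] -> state to fall back to on a mismatch
    out = [[]]    # out[state] -> words that end when this state is reached

    # Phase 1: insert every word into the trie.
    for word in words:
        state = 0
        for ch in word:
            nxt = goto[state].get(ch)
            if nxt is None:
                nxt = len(goto)
                goto[state][ch] = nxt
                goto.append({})
                fail.append(0)
                out.append([])
            state = nxt
        out[state].append(word)

    # Phase 2: breadth-first walk to compute failure links.
    # States directly below the root keep fail == 0.
    queue = deque(goto[0].values())
    while queue:
        state = queue.popleft()
        for ch, nxt in goto[state].items():
            queue.append(nxt)
            f = fail[state]
            while f and ch not in goto[f]:
                f = fail[f]
            fail[nxt] = goto[f].get(ch, 0)
            # Inherit matches that end at the fallback state.
            out[nxt] = out[nxt] + out[fail[nxt]]
    return goto, fail, out


def find_all(text, automaton):
    """Return (start_index, word) for every occurrence, in one pass."""
    goto, fail, out = automaton
    state = 0
    hits = []
    for i, ch in enumerate(text):
        while state and ch not in goto[state]:
            state = fail[state]
        state = goto[state].get(ch, 0)
        for word in out[state]:
            hits.append((i - len(word) + 1, word))
    return hits


def contains_any(text, automaton):
    """True as soon as any word is found; stops at the first hit."""
    goto, fail, out = automaton
    state = 0
    for ch in text:
        while state and ch not in goto[state]:
            state = fail[state]
        state = goto[state].get(ch, 0)
        if out[state]:
            return True
    return False
```

Usage would be something like building the automaton once with
build_automaton(list_items) and then calling contains_any(string, automaton)
for each long string. A C implementation will of course be much faster than
this; the point is only to show why the scan does not depend on the number of
items.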