Raymond Hettinger:
> Regular expressions should do the trick.
> >>> stoppattern = '|'.join(map(re.escape, stoplist))
> >>> re.sub(stoppattern, '', mystr)

If the stop words are many (and similar to one another), that RE can be optimized with a trie-based strategy, like this one called "List":
http://search.cpan.org/~dankogai/Regexp-Optimizer-0.15/lib/Regexp/List.pm

"List" is used by something more complex called "Optimizer", which is overkill for the OP's problem:
http://search.cpan.org/~dankogai/Regexp-Optimizer-0.15/lib/Regexp/Optimizer.pm

I don't know whether a Python module similar to "List" is available; I may write it :-)

Bye,
bearophile
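
To make the trie idea concrete, here is a minimal sketch in Python. It is not a port of Regexp::List, just an illustration of the same strategy: words that share a prefix share that part of the pattern, so the regex engine does not have to retry each alternative from scratch. The function name trie_regex, the sample stoplist, and the \b word-boundary wrapper are my own assumptions, not something taken from the Perl module or from the quoted code.

```python
import re

def trie_regex(words):
    """Return a regex source string matching exactly the given words,
    with common prefixes factored out (hypothetical helper)."""
    # Build a trie of nested dicts; the empty-string key marks "a word ends here".
    trie = {}
    for word in words:
        node = trie
        for ch in word:
            node = node.setdefault(ch, {})
        node[""] = {}

    def to_pattern(node):
        ends_here = "" in node
        # One alternative per outgoing character, sorted for a stable result.
        branches = [re.escape(ch) + to_pattern(child)
                    for ch, child in sorted(node.items()) if ch != ""]
        if not branches:
            return ""
        if len(branches) == 1 and not ends_here:
            return branches[0]
        group = "(?:" + "|".join(branches) + ")"
        # If a word may end at this node, the rest is optional.
        return group + "?" if ends_here else group

    return to_pattern(trie)

# Sample data, assumed for illustration only.
stoplist = ["the", "then", "there", "this"]
pattern = trie_regex(stoplist)   # 'th(?:e(?:n|re)?|is)'

# Assumption: stop words should only match as whole words, hence the \b anchors.
stop_re = re.compile(r"\b(?:" + pattern + r")\b")
cleaned = stop_re.sub("", "the theme of this thesis")
```

For these four words the flat alternation would be the|then|there|this, while the trie version is th(?:e(?:n|re)?|is). Whether this is actually faster depends on the regex engine and on the word list, so it is worth timing on real data before adopting it.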