Jay wrote:
> Let's say, for instance, that one was programming a spell checker or
> some other function where the contents of a string from a text-editor's
> text box needed to be split so that the resulting array has each word
> as an element. Is there a shortcut to do this and, if not, what's the
> best and most efficient token group for the split function to achieve
> this?

I'm sure this is not perfect, but it gives one the general idea.

py> import re
py> rgx = re.compile(r'(?:\s+)|[()\[\].,?;-]+')
py> print astr
Four score and seven years ago, our forefathers, who art in heaven
(hallowed be their names), did forthwith declare that all men are
created to shed their mortal coils and to be given daily bread, even
in the best of times and the worst of times.

With liberty and justice for all.

-William Shakespear
py> [s for s in rgx.split(astr) if s]
['Four', 'score', 'and', 'seven', 'years', 'ago', 'our', 'forefathers',
 'who', 'art', 'in', 'heaven', 'hallowed', 'be', 'their', 'names', 'did',
 'forthwith', 'declare', 'that', 'all', 'men', 'are', 'created', 'to',
 'shed', 'their', 'mortal', 'coils', 'and', 'to', 'be', 'given', 'daily',
 'bread', 'even', 'in', 'the', 'best', 'of', 'times', 'and', 'the',
 'worst', 'of', 'times', 'With', 'liberty', 'and', 'justice', 'for',
 'all', 'William', 'Shakespear']

James

--
James Stroud
UCLA-DOE Institute for Genomics and Proteomics
Box 951570
Los Angeles, CA 90095

http://www.jamesstroud.com/
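
P.S. A possible variation on the split-and-filter idea above, offered only as a sketch: rather than listing the separators and discarding the empty strings that split() leaves behind, one can describe what a word looks like and collect the matches with re.findall. The names WORD_RE and split_words are made up for illustration, and the choice to treat letters, digits and apostrophes as word characters is an assumption; a real spell checker would likely need a more careful definition (hyphenated words, non-ASCII letters, and so on).

```python
import re

# Hypothetical pattern: a "word" is a run of ASCII letters, digits or
# apostrophes. This is an assumption, not a rule from the original post.
WORD_RE = re.compile(r"[A-Za-z0-9']+")


def split_words(text):
    """Return the words found in text, in order, with punctuation dropped."""
    # findall returns only the matched runs, so there are no empty
    # strings to filter out afterwards.
    return WORD_RE.findall(text)


sample = "Four score and seven years ago, our forefathers (hallowed be their names) didn't stop."
print(split_words(sample))
```

Because the pattern names the characters to keep instead of the characters to drop, punctuation that was not anticipated (a colon or an exclamation mark, say) is skipped without having to extend a separator list. The trade-off is that anything outside the character class, such as accented letters, also splits a word.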