Jeremy Bowers wrote:
> On Fri, 01 Apr 2005 14:20:51 -0800, RickMuller wrote:
>
> > I'm trying to split a string into pieces on whitespace, but I want to
> > save the whitespace characters rather than discarding them.
> >
> > For example, I want to split the string '1 2' into ['1',' ','2'].
> > I was certain that there was a way to do this using the standard string
> > functions, but I just spent some time poring over the documentation
> > without finding anything.
>
> Python 2.3.5 (#1, Mar 3 2005, 17:32:12)
> [GCC 3.4.3 (Gentoo Linux 3.4.3, ssp-3.4.3-0, pie-8.7.6.6)] on linux2
> Type "help", "copyright", "credits" or "license" for more information.
> >>> import re
> >>> whitespaceSplitter = re.compile("(\w+)")
> >>> whitespaceSplitter.split("1 2 3 \t\n5")
> ['', '1', ' ', '2', ' ', '3', ' \t\n', '5', '']
> >>> whitespaceSplitter.split(" 1 2 3 \t\n5 ")
> [' ', '1', ' ', '2', ' ', '3', ' \t\n', '5', ' ']
>
> Note the null strings at the beginning and end if there are no instances
> of the split RE at the beginning or end. Pondering the second invocation
> should show why they are there, though darned if I can think of a good way
> to put it into words.
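One way to put the null strings into words: re.split() always returns the text before the first match and the text after the last match, so whenever the pattern matches at the very start or very end of the string, that piece is empty. The session above splits on "(\w+)", so a string that begins and ends with a word character gets an empty piece at each end. A minimal sketch (the variable names are mine, not from the thread, and I am assuming a current Python with raw-string patterns):

```python
import re

text = "1 2 3 \t\n5"

# Splitting on runs of word characters: the string starts and ends with a
# match, so the pieces before the first and after the last match are empty.
by_word = re.split(r"(\w+)", text)

# Splitting on runs of whitespace: no match touches either end here,
# so no empty pieces appear.
by_space = re.split(r"(\s+)", text)

# Pad the string with whitespace and the empty pieces show up again,
# this time for the whitespace pattern.
padded = re.split(r"(\s+)", " " + text + " ")

print(by_word)
print(by_space)
print(padded)
```

In every case the capturing group is what keeps the separators in the result list.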
If you don't want any null strings at the beginning or the end, an
equivalent regexp is:

>>> whitespaceSplitter_2 = re.compile("\w+|\s+")
>>> whitespaceSplitter_2.findall("1 2 3 \t\n5")
['1', ' ', '2', ' ', '3', ' \t\n', '5']
>>> whitespaceSplitter_2.findall(" 1 2 3 \t\n5 ")
[' ', '1', ' ', '2', ' ', '3', ' \t\n', '5', ' ']

George
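One caveat with "\w+|\s+": findall() silently skips any character that is neither a word character nor whitespace, such as punctuation, so the pieces no longer join back into the original string. If that matters, a variant using "\S+|\s+" (runs of non-whitespace or runs of whitespace) may be closer to what was asked for. This is only a sketch; the function name split_keep_whitespace is made up for illustration:

```python
import re

def split_keep_whitespace(text):
    """Return alternating runs of non-whitespace and whitespace, in order."""
    # Every character is either whitespace or not, so nothing is dropped
    # and no empty strings are produced.
    return re.findall(r"\S+|\s+", text)

sample = " 1 2 3 \t\n5 "
pieces = split_keep_whitespace(sample)
print(pieces)

# The pieces reassemble into the original string.
print("".join(pieces) == sample)

# With punctuation, "\w+|\s+" loses the comma while "\S+|\s+" keeps it.
lossy = re.findall(r"\w+|\s+", "a, b")
lossless = split_keep_whitespace("a, b")
print(lossy, lossless)
```

For input made up only of word characters and whitespace, as in the original question, both patterns give the same list.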