On Fri, 09 Dec 2005 18:02:02 -0800, James Stroud wrote:

> Thomas Liesner wrote:
>> Hi all,
>>
>> I have a text file which contains a single string with names.
>> I want to split this string into its records and put them into a list.
>> In "normal" cases I would do something like:
>>
>>
>>>#!/usr/bin/python
>>>inp = open("file")
>>>data = inp.read()
>>>names = data.split()
>>>inp.close()
>>
>>
>> The problem is that the names contain spaces and the records are also
>> just separated by spaces. The only thing I can rely on is that the
>> record separator is always more than a single whitespace.
>>
>> I thought of something like defining the separator for split() by using
>> a regex for "more than one whitespace". RegEx for whitespace is \s, but
>> what would I use for "more than one"? \s+?
>>
>> TIA,
>> Tom
>
> The one I like best goes like this:
>
> py> data = "Guido van Rossum Tim Peters Thomas Liesner"
> py> names = [n for n in data.split() if n]
> py> names
> ['Guido', 'van', 'Rossum', 'Tim', 'Peters', 'Thomas', 'Liesner']
>
> I think it is theoretically faster (and more pythonic) than using regexes.

Yes, but the correct result would be:

['Guido van Rossum', 'Tim Peters', 'Thomas Liesner']

Your code is short and elegant, but wrong. It could also be shorter and
more elegant:

# your version
py> data = "Guido van Rossum Tim Peters Thomas Liesner"
py> [n for n in data.split() if n]
['Guido', 'van', 'Rossum', 'Tim', 'Peters', 'Thomas', 'Liesner']

# my version
py> data = "Guido van Rossum Tim Peters Thomas Liesner"
py> data.split()
['Guido', 'van', 'Rossum', 'Tim', 'Peters', 'Thomas', 'Liesner']

The "if n" in the list comp is superfluous, because split() with no
arguments never returns empty strings, and without that, the whole list
comp is unnecessary.

-- 
Steven.
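
For the result Tom asked for, here is a minimal sketch using re.split(). It assumes the records really are separated by two or more whitespace characters, as Tom described. The sample string and its exact spacing are made up for illustration. Note that \s+ would also match a single space and so would split inside the names, just like a plain split(); \s{2,} only matches runs of two or more.

```python
import re

# Assumed sample data: names contain single spaces, records are separated
# by two or more whitespace characters (the exact spacing is an assumption).
data = "Guido van Rossum  Tim Peters   Thomas Liesner\n"

# Strip first so leading/trailing whitespace does not yield empty strings,
# then split only on runs of two or more whitespace characters.
names = re.split(r"\s{2,}", data.strip())
print(names)
# ['Guido van Rossum', 'Tim Peters', 'Thomas Liesner']
```

Whether this is faster than a non-regex approach has not been measured here. It is only meant to show the pattern.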