Mensenator said:
c = '0010000110'
c.split('0')
['', '', '1', '', '', '', '11', '']

Ok, the consecutive delimiters appear as empty strings for
reasons unknown (except for the first one). Except when they
start or end the string in which case the first one is included.

Maybe there's a reason for this inconsistent behaviour but you
won't find it in the documentation.

The "reason unknown" is that split() is designed to handle *substrings separated by delimiters*, not *consecutive character runs*. Think of TAB-separated (or, if you prefer, COMMA-separated) strings.

In English:

 one<TAB>two<TAB><TAB>four

If you split the above string on the <TAB> character, you really do want to get an empty string among the result substrings, indicating that "column 3" is empty.

In Python:

>>> line = "one\ttwo\t\tfour"
>>> line.split('\t')
['one', 'two', '', 'four']

A result of ['one', 'two', 'four'] would be misleading, no?
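If what you actually want is the runs of characters between delimiters, you have to ask for that explicitly. Here is a rough sketch of a few ways to do it, using the two strings from this thread. The variable names (fields, runs, runs_re, words) are just mine for illustration, and the regular expression assumes the runs you care about consist of '1' characters.

```python
import re

line = "one\ttwo\t\tfour"
c = "0010000110"

# Splitting on an explicit delimiter keeps empty fields, so column
# positions survive: index 2 is the empty third column.
fields = line.split("\t")

# To keep only the non-empty pieces, filter out the empty strings.
runs = [s for s in c.split("0") if s]

# Alternatively, match the runs directly instead of splitting.
runs_re = re.findall(r"1+", c)

# split() with no argument treats any run of whitespace as a single
# separator, so the empty column disappears.
words = line.split()

print(fields)
print(runs)
print(runs_re)
print(words)
```

Note that the no-argument form only collapses whitespace; there is no equivalent shortcut for an arbitrary delimiter like '0', which is why the filter or the regular expression is needed there.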

-John

--
http://mail.python.org/mailman/listinfo/python-list
