Raymond Hettinger, maybe it would be useful to add an optional flag argument that tells split_on whether to keep the separators or not? This is the xsplit I usually use:

def xsplit(seq, key=bool, keepkeys=True):
    """xsplit(seq, key=bool, keepkeys=True): given an iterable seq and a
    predicate key, splits the iterable where key(item) is true and yields
    the parts as lists. If keepkeys is True then the splitting items are
    kept at the beginning of the sublists (but the first sublist may miss
    the key item).

    >>> list(xsplit([]))
    []

    >>> key = lambda x: 0x80 & x
    >>> l = [1,2,3,0xF0,4,5,6,0xF1,7,8,0xF2,9,10,11,12,13]
    >>> list(xsplit(l, key=key))
    [[1, 2, 3], [240, 4, 5, 6], [241, 7, 8], [242, 9, 10, 11, 12, 13]]

    >>> l = [0xF0,1,2,3,0xF0,4,5,6,0xF1,7,8,0xF2,9,10,11,12,13,0xF0,14,0xF1]
    >>> list(xsplit(l, key=key, keepkeys=False))
    [[1, 2, 3], [4, 5, 6], [7, 8], [9, 10, 11, 12, 13], [14]]

    >>> s1 = "100001000101100001000000010000"
    >>> ["".join(map(str, g)) for g in xsplit(s1, key=int)]
    ['10000', '1000', '10', '1', '10000', '10000000', '10000']

    >>> from itertools import groupby # To compare against groupby
    >>> s2 = "1111100011111100011100101011111"
    >>> ["".join(map(str, g)) for h, g in groupby(s2, key=int)]
    ['11111', '000', '111111', '000', '111', '00', '1', '0', '1', '0', '11111']
    """
    group = []
    for el in seq:
        if key(el):
            # A splitting item closes the current group, if there is one.
            if group:
                yield group
            group = []
            if keepkeys:
                group.append(el)
        else:
            group.append(el)
    if group:
        yield group

Maybe it's better to separate or denote the separators in some way?

A possibility:
"X1X23X456X" => "X", "1", "X", "23", "X", "456", "X"

Another possibility:
"X1X23X456X" => ("", "X"), ("1", "X"), (["2", "3"], "X"), (["4", "5", "6"], "X")

Another possibility (True == is a separator):
"X1X23X456X" => (True, "X"), (False, ["1"]), (True, "X"), (False, ["2", "3"]), (True, "X"), (False, ["4", "5", "6"]), (True, "X")

Is it useful to merge successive separators (notice the two X)?
"X1X23XX456X" => (True, ["X"]), (False, ["1"]), (True, ["X"]), (False, ["2", "3"]), (True, ["X", "X"]), (False, ["4", "5", "6"]), (True, ["X"])

Oops, this is groupby :-)

Is a name like isplitter or splitter better for this itertool?

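
For what it's worth, the "this is groupby" remark can be checked with a few lines. This is only a sketch: tagged_runs is a hypothetical name made up for the example (it is not an itertools function), and the predicate result is passed through bool() so that truthy keys such as 0x80 & x all land in the same run.

```python
from itertools import groupby

def tagged_runs(seq, key=bool):
    """Yield (is_separator, items) pairs; successive separators are merged.

    tagged_runs is a hypothetical helper name used only for this sketch.
    """
    # Normalise the predicate to True/False so every separator compares equal
    # and consecutive separators end up in one group.
    for is_sep, run in groupby(seq, key=lambda el: bool(key(el))):
        yield is_sep, list(run)

result = list(tagged_runs("X1X23XX456X", key=lambda c: c == "X"))
print(result)
```

With the merged-separator input above, the pairs come out in the same (True, [...]) / (False, [...]) shape as the last example, so that variant needs no new itertool; only the xsplit-style "keep the key at the start of the sublist" behaviour does.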
Bye,
bearophile