Steven D'Aprano <st...@remove-this-cybersource.com.au> writes:

> Suppose I have an iterator that yields tuples of N items (a, b, ... n).
>
> I want to split this into N independent iterators:
>
> iter1 -> a, a2, a3, ...
> iter2 -> b, b2, b3, ...
> ...
> iterN -> n, n2, n3, ...
>
> The iterator may be infinite, or at least too big to collect in a list.
>
> My first attempt was this:
>
> def split(iterable, n):
>     iterators = []
>     for i, iterator in enumerate(itertools.tee(iterable, n)):
>         iterators.append((t[i] for t in iterator))
>     return tuple(iterators)
>
> But it doesn't work, as all the iterators see the same values:
>
> >>> data = [(1,2,3), (4,5,6), (7,8,9)]
> >>> a, b, c = split(data, 3)
> >>> list(a), list(b), list(c)
> ([3, 6, 9], [3, 6, 9], [3, 6, 9])
>
> I tried changing the t[i] to use operator.itemgetter instead, but no
> luck. Finally I got this:
>
> def split(iterable, n):
>     iterators = []
>     for i, iterator in enumerate(itertools.tee(iterable, n)):
>         f = lambda it, i=i: (t[i] for t in it)
>         iterators.append(f(iterator))
>     return tuple(iterators)
>
> which seems to work:
>
> >>> data = [(1,2,3), (4,5,6), (7,8,9)]
> >>> a, b, c = split(data, 3)
> >>> list(a), list(b), list(c)
> ([1, 4, 7], [2, 5, 8], [3, 6, 9])
>
> Is this the right approach, or have I missed something obvious?
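
For reference, the first attempt quoted above fails because a generator expression looks up i when it is consumed, not when it is created. By then the loop has finished and i holds its last value. The sketch below reproduces both behaviours. The names split_late, split_bound and column are made up for this illustration and do not appear in the original post.

```python
import itertools

def split_late(iterable, n):
    # Mirrors the first attempt: `i` is a free variable of each generator
    # expression, so it is read only when the generator is consumed.
    iterators = []
    for i, iterator in enumerate(itertools.tee(iterable, n)):
        iterators.append((t[i] for t in iterator))
    return tuple(iterators)

def split_bound(iterable, n):
    # Passing `i` through a function call fixes its value at creation
    # time, which is what the `lambda it, i=i: ...` trick achieves too.
    def column(it, i):
        return (t[i] for t in it)
    return tuple(column(it, i)
                 for i, it in enumerate(itertools.tee(iterable, n)))

data = [(1, 2, 3), (4, 5, 6), (7, 8, 9)]
late = [list(it) for it in split_late(data, 3)]
bound = [list(it) for it in split_bound(data, 3)]
```

Here late holds three copies of the last column, while bound holds the three separate columns.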
It is quite straightforward to implement your "split" function without
itertools.tee:

from collections import deque

def split(iterable):
    it = iter(iterable)
    q = [deque([x]) for x in it.next()]
    def proj(qi):
        while True:
            if not qi:
                for qj, xj in zip(q, it.next()):
                    qj.append(xj)
            yield qi.popleft()
    for qi in q:
        yield proj(qi)

>>> data = [(1,2,3), (4,5,6), (7,8,9)]
>>> a, b, c = split(data)
>>> print list(a), list(b), list(c)
[1, 4, 7] [2, 5, 8] [3, 6, 9]

Interestingly, given "split" it is very easy to implement "tee":

def tee(iterable, n=2):
    return split(([x]*n for x in iterable))

>>> a, b = tee(range(10), 2)
>>> a.next(), a.next(), b.next()
(0, 1, 0)
>>> a.next(), a.next(), b.next()
(2, 3, 1)

In fact, split(x) is the same as zip(*x) when x is finite.  The
difference is that with split(x), x is allowed to be infinite and with
zip(*x), each term of x is allowed to be infinite.  It may be good to
have a function unifying the two.

-- 
Arnaud
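
The code in this reply is written for Python 2 (it.next(), the print statement). Under Python 3 the .next() method is gone, and a StopIteration escaping from inside a generator is turned into a RuntimeError, so the exhausted-input case has to be caught explicitly. What follows is only a sketch of one possible port, not the author's own code. The try/except handling and the choice to yield nothing for an empty input are assumptions added here.

```python
from collections import deque
from itertools import count, islice

def split(iterable):
    it = iter(iterable)
    try:
        first = next(it)
    except StopIteration:
        return  # assumption: an empty input produces no iterators
    # One queue per tuple position, seeded with the first tuple.
    q = [deque([x]) for x in first]

    def proj(qi):
        while True:
            if not qi:
                # This queue is drained: pull one more tuple from the
                # source and distribute its items over all the queues.
                try:
                    row = next(it)
                except StopIteration:
                    return  # source exhausted, end this iterator too
                for qj, xj in zip(q, row):
                    qj.append(xj)
            yield qi.popleft()

    for qi in q:
        yield proj(qi)

def tee(iterable, n=2):
    # Same idea as in the post: duplicate each item n times, then split.
    return split([x] * n for x in iterable)

data = [(1, 2, 3), (4, 5, 6), (7, 8, 9)]
a, b, c = split(data)
columns = [list(a), list(b), list(c)]

# An infinite source works, as long as only finitely many items are taken.
ns, squares = split((n, n * n) for n in count())
first_squares = list(islice(squares, 4))
first_ns = list(islice(ns, 2))

t1, t2 = tee(range(10), 2)
teed = (next(t1), next(t1), next(t2))
```

For the finite data above, columns matches [list(col) for col in zip(*data)]. Note that the queues grow without bound if one iterator is consumed far ahead of the others.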