I've got several iterators, each sorted by a common key, and would like to iterate over them in parallel, operating on all items with the same key. I've simplified the data a bit here, but it would be something like:

    data1 = [ # key, data1
        (1, "one A"),
        (1, "one B"),
        (2, "two"),
        (5, "five"),
        ]
    data2 = [ # key, data1
        (1, "uno"),
        (2, "dos"),
        (3, "tres x"),
        (3, "tres y"),
        (3, "tres z"),
        (4, "cuatro"),
        ]
    data3 = [ # key, data1, data2
        (2, "ii", "extra alpha"),
        (4, "iv", "extra beta"),
        (5, "v", "extra gamma"),
        ]

And I'd like to do something like

    for common_key, d1, d2, d3 in magic_happens_here(data1, data2, data3):
        for row in d1:
            process_a(common_key, row)
        for thing in d2:
            process_b(common_key, thing)
        for thing in d3:
            process_c(common_key, thing)

which would yield each common_key along with, for each source, an iterator over just the rows having that key (note that gaps can happen, i.e. a key may be missing from some sources, but the sort order should remain the same). So in the above data, the outer "for" loop would run 5 times with common_key taking the values 1, 2, 3, 4, 5 in turn, and each of d1, d2, d3 being an iterator that deals with just that key's data from its source.

My original method was hauling everything into memory and making multiple passes filtering on the data. However, the actual sources are CSV files, some of which are hundreds of megs in size, and my system was taking a bit of a hit. So I was hoping for a way to do this with only one complete pass through each source (since they're sorted by the common key).

It's somewhat similar to the *nix "join" command, only dealing with N files.

Thanks for any hints.

-tkc

-- 
https://mail.python.org/mailman/listinfo/python-list
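
For reference, one possible shape for such a helper is sketched below. It is only a minimal sketch under some assumptions: magic_happens_here is the placeholder name from the loop above, the key parameter is a hypothetical addition, every source is assumed to be already sorted by its key, and each per-key group is handed back as a small list rather than a lazy iterator (so only the rows for one key are held in memory at a time). It wraps each source in itertools.groupby and, on each step, advances whichever sources currently hold the smallest key.

```python
from itertools import groupby
from operator import itemgetter


def magic_happens_here(*sources, key=itemgetter(0)):
    """Yield (common_key, rows_1, ..., rows_N) in key order.

    Assumes every source is already sorted by `key`. Each rows_i is a
    list of the rows from source i having that key (empty if none).
    """
    # One groupby per source; each yields (key, group_iterator) pairs.
    groupers = [groupby(src, key=key) for src in sources]
    # The pending (key, group) pair for each source, None once exhausted.
    heads = [next(g, None) for g in groupers]

    while any(h is not None for h in heads):
        # The smallest key still pending across all sources comes next.
        current = min(h[0] for h in heads if h is not None)
        groups = []
        for i, head in enumerate(heads):
            if head is not None and head[0] == current:
                # Consume the group before advancing the groupby,
                # because advancing invalidates the old group iterator.
                groups.append(list(head[1]))
                heads[i] = next(groupers[i], None)
            else:
                # This source has a gap at the current key.
                groups.append([])
        yield (current, *groups)
```

With the sample data above, the outer loop should run once for each of the keys 1 through 5, with an empty list wherever a source has no rows for that key. For the CSV case each source could be a csv.reader over an open file; since CSV fields come back as strings, a numeric key would need something like key=lambda row: int(row[0]) so that keys compare consistently across sources.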