Hi.

I need to be able to lazily merge a variable number of already-sorted(*) variable-length sequences into a single sorted sequence. The merge should continue until the longest sequence has been exhausted.

(*) They may in practice be a lazy source of data known to only ever be generated in an order that's "sorted" from the POV of this function.

I have the following - can anyone think of any improvements (other than indentation style on the try/excepts ;) )? This seems like the sort of thing for which there might be a built-in or recipe that I've missed.

-----------------------------------------
def merge(*sources):
    items = []
    for source in (iter(s) for s in sources):
       try: items.append([next(source), source])
       except StopIteration: pass

    while len(items) > 1:
        items.sort(key=lambda item: item[0])
        lowest = items[0]
        yield lowest[0]

        try: lowest[0] = next(lowest[1])
        except StopIteration: del items[0]

    if len(items) != 0:
       yield items[0][0]
       yield from items[0][1]
-----------------------------------------

Test code:

-----------------------------------------
a = range(10)
b = range(4, 50, 3)
c = range(20, 100, 5)

# Greedy version to verify result:
greedy = list(a) + list(b) + list(c)
greedy.sort()

# Test multiple sequences:
assert list(merge(a, b, c)) == greedy

# Test a single sequence:
assert list(merge(greedy)) == list(merge(a, b, c))

# Test no sequences:
assert list(merge()) == []
assert list(merge([], (), range(0))) == []
-----------------------------------------

Thanks, E.
--
https://mail.python.org/mailman/listinfo/python-list

Reply via email to