A pipeline can be described as a sequence of functions applied to an input, with each function receiving the output of the preceding one:

    out = f6(f5(f4(f3(f2(f1(in))))))

However, this isn't very readable and does not support conditionals.

TensorFlow has tensor-focused pipelines:

    fc1 = layers.fully_connected(x, 256, activation_fn=tf.nn.relu, scope='fc1')
    fc2 = layers.fully_connected(fc1, 256, activation_fn=tf.nn.relu, scope='fc2')
    out = layers.fully_connected(fc2, 10, activation_fn=None, scope='out')

I have some code that lets me mimic this, but with an implied parameter:

    import functools
    from functools import reduce  # a builtin in Python 2; lives in functools in Python 3

    def executePipeline(steps, collection_funcs=(map, filter, reduce)):
        results = None
        for step in steps:
            func = step[0]
            params = step[1]
            if func in collection_funcs:
                # Collection step: bind the extra params to the row function,
                # then apply it across the previous results.
                print(func, params[0])
                results = func(functools.partial(params[0], *params[1:]), results)
            else:
                print(func)
                if results is None:
                    # First step: there is nothing to pass along yet.
                    results = func(*params)
                else:
                    # Later steps: the previous results become the last argument.
                    results = func(*(params + (results,)))
        return results

    executePipeline(
        [
            (read_rows, (in_file,)),
            (map, (lower_row, field)),
            (stash_rows, ('stashed_file',)),
            (map, (lemmatize_row, field)),
            (vectorize_rows, (field, min_count,)),
            (evaluate_rows, (weights, None)),
            (recombine_rows, ('stashed_file',)),
            (write_rows, (out_file,)),
        ]
    )

This gets me close, but I can't control where the rows get passed in: in the above code, they are always the last parameter.

I feel like I'm reinventing the wheel here. Does something like this already exist?
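
One way the injection point could be made explicit is sketched below. This is only an illustration, not an existing library: the sentinel PREV, the function run_pipeline, and the helpers scale and offset are all hypothetical names. A step that contains PREV in its parameters receives the previous result at that position; a step without it falls back to the "append as last parameter" behaviour described above.

```python
# Hypothetical sentinel marking where the previous step's result is injected.
PREV = object()


def run_pipeline(steps):
    """Run (func, params) steps, replacing PREV with the prior step's result."""
    result = None
    for index, (func, params) in enumerate(steps):
        if any(p is PREV for p in params):
            # Explicit placement: substitute the result wherever PREV appears.
            args = tuple(result if p is PREV else p for p in params)
        elif index == 0:
            # First step: there is no previous result to pass along.
            args = tuple(params)
        else:
            # No placeholder given: append the result as the last argument.
            args = tuple(params) + (result,)
        result = func(*args)
    return result


# Hypothetical helpers that expect the rows in different positions.
def scale(rows, factor):
    return [r * factor for r in rows]


def offset(amount, rows):
    return [r + amount for r in rows]


demo = run_pipeline([
    (list, ([1, 2, 3],)),   # source step
    (scale, (PREV, 10)),    # rows passed as the first argument
    (offset, (1,)),         # no PREV: rows appended as the last argument
])
```

With this shape, map and filter would need no special case, since a step such as (map, (some_row_func, PREV)) states its own argument order; whether that is preferable to the collection_funcs check is a judgment call.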