While studying iterators and generator expressions, I started wishing I had some tools for processing the values. I wanted to be able to chain together a set of functions, sort of like the "pipelines" you can make with command-line programs.
So, I wrote a module called iterwrap.py. You can download it from here:

    http://home.blarg.net/~steveha/iterwrap.tar.gz

iterwrap has functions that "wrap" an iterator; when you call the .next() method on a wrapped iterator, it will get the .next() value from the original iterator, apply a function to it, and return the new value. Of course, a wrapped iterator is itself an iterator, so you can wrap it again: you can build up a "chain" of wrappers that will do the processing you want.

As an example, here's a command-line pipeline:

    cat mylist | sort | uniq > newlist

Here's the analogous example from iterwrap:

    newlist = list(iterwrap.uniq(iterwrap.sort(mylist)))

You need to call list() because all the wrapper functions in iterwrap always return an iterator. That final list() forces the iterator returned by uniq() to be expanded out to a list.

iterwrap.py defines functions based on many common command-line tools: sort, uniq, tr, grep, cat, head, tail, and tee. Plus it defines some other functions that seemed like they would be useful.

Well, it doesn't take very many nested function calls before the call gets visually confusing, with lots of parentheses at the end. To avoid this, you can arrange the calls in a vertical chain, like this:

    temp = iterwrap.sort(mylist)
    temp = iterwrap.uniq(temp)
    newlist = list(temp)

But I wanted to provide a convenience class to allow "dot-chaining". I wanted something like this to work:

    from iterwrap import *
    newlist = Pipe(mylist).sort.uniq.list()

I have actually coded up two classes. One, Pipe, works as shown above. The other, which I unimaginatively called "IW" (for "iterwrap"), works in a right-to-left order:

    from iterwrap import *
    iw = IW()
    newlist = iw.list.uniq.sort(mylist)

Hear now my cry for help: Both IW() and Pipe() have annoying problems. I'd like to have one class that just works.

The problem with Pipe() is that it will act very differently depending on whether the user remembers to put the "()" on the end.
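
To make the parentheses problem concrete, here is a minimal sketch of one way a dot-chaining class could behave like this. It is not the actual iterwrap code, which is not shown in this post: the name PipeSketch, the _WRAPPERS table, and the sort and uniq stand-ins are all hypothetical, and the sketch targets current Python 3 rather than the .next() era. It only assumes that attribute access applies a wrapper and that a trailing call hands back the underlying value.

```python
# Hypothetical stand-ins for two iterwrap wrappers (assumed behavior).
def sort(iterable):
    """Return an iterator over the items in sorted order."""
    return iter(sorted(iterable))


def uniq(iterable):
    """Yield items, dropping adjacent duplicates like command-line uniq."""
    missing = object()
    previous = missing
    for item in iterable:
        if previous is missing or item != previous:
            yield item
        previous = item


# Names that may appear in a dot-chain (illustrative table, not iterwrap's).
_WRAPPERS = {"sort": sort, "uniq": uniq, "list": list}


class PipeSketch:
    """Left-to-right dot-chaining: PipeSketch(seq).sort.uniq.list()"""

    def __init__(self, source):
        self._value = source

    def __getattr__(self, name):
        # Only reached for names not found normally, i.e. chain steps.
        wrapper = _WRAPPERS.get(name)
        if wrapper is None:
            raise AttributeError(name)
        # Apply the wrapper and stay inside a PipeSketch so chaining continues.
        return PipeSketch(wrapper(self._value))

    def __iter__(self):
        # A PipeSketch can be iterated whether or not it was called.
        return iter(self._value)

    def __call__(self):
        # The trailing "()" unwraps the underlying value.
        return self._value


mylist = ["b", "a", "b", "c", "a"]
with_parens = PipeSketch(mylist).sort.uniq.list()    # a real list
without_parens = PipeSketch(mylist).sort.uniq.list   # still a PipeSketch
print(with_parens)                    # ['a', 'b', 'c']
print(type(without_parens).__name__)  # PipeSketch
```

In this sketch, nothing in __getattr__ can tell whether "list" is the last step of the chain, so the result type depends entirely on the trailing call.
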
For all the dot-chained functions in the middle of the chain, you don't need to put parentheses; it will just work. However, for the function at the end of the dot-chain, you really ought to put the parentheses. In the given example, if the user remembers to put the parentheses, newlist will be set to a list; otherwise, newlist will be set to an instance of class Pipe.

An instance of class Pipe works as an iterator, so in this example:

    itr = Pipe(mylist).sort.uniq

...then the user really need not care whether there are parentheses after uniq or not. Which of course will make it all the more confusing when the list() case breaks.

In comparison with Pipe, IW is clean and elegant. The user cannot forget the parenthetical expression on the end, since that's where the initial sequence (list or iterator) is provided! The annoying thing about IW is that the dot-chained functions cannot have extra arguments passed in.

This example works correctly:

    newlist = Pipe(mylist).grep("larch").grep("parrot", "v").list()

newlist will be set to a list of all strings from mylist that contain the string "larch" but do not contain the string "parrot". There is no way to do this example with IW, because IW expects just one call to its __call__() function. The best you could do with IW is:

    temp = iw.grep(mylist, "larch")
    newlist = iw.list.grep(temp, "parrot", "v")

Since it *is* legal to pass extra arguments to the one permitted __call__(), this works, but it's really not very much of an advantage over the vertical chain:

    temp = grep(mylist, "larch")
    temp = grep(temp, "parrot", "v")
    newlist = list(temp)

The key point here is that, when processing a dot-chain, my code doesn't actually know whether it's looking at the end of the dot-chain. If you had

    newlist = Pipe(mylist).foo.bar.baz

and if my code could somehow know that baz is the last thing in the chain, it could treat baz specially (and do the right thing whether there are parentheses on it, or not).
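
For comparison, the right-to-left IW style can be sketched so that the single call is the unambiguous end of the chain. Again, this is only an illustration under assumptions, not the real iterwrap implementation: IWSketch, _WRAPPERS, and the stand-in functions are hypothetical names, and grep here does plain substring matching with "v" meaning "invert", which may be simpler than what iterwrap actually does.

```python
# Hypothetical stand-ins for iterwrap wrappers (assumed behavior).
def sort(iterable):
    return iter(sorted(iterable))


def uniq(iterable):
    """Yield items, dropping adjacent duplicates."""
    missing = object()
    previous = missing
    for item in iterable:
        if previous is missing or item != previous:
            yield item
        previous = item


def grep(iterable, pattern, flags=""):
    """Yield strings containing pattern; "v" in flags inverts the match."""
    invert = "v" in flags
    for item in iterable:
        if (pattern in item) != invert:
            yield item


_WRAPPERS = {"sort": sort, "uniq": uniq, "grep": grep, "list": list}


class IWSketch:
    """Right-to-left dot-chaining: iw.list.uniq.sort(seq)"""

    def __init__(self, funcs=()):
        self._funcs = tuple(funcs)

    def __getattr__(self, name):
        func = _WRAPPERS.get(name)
        if func is None:
            raise AttributeError(name)
        # Attribute access only records the function; no work happens yet.
        return IWSketch(self._funcs + (func,))

    def __call__(self, source, *args):
        if not self._funcs:
            raise TypeError("no functions in the chain")
        # The rightmost name receives the source and any extra arguments;
        # the remaining functions are applied right to left, one argument each.
        *outer, innermost = self._funcs
        result = innermost(source, *args)
        for func in reversed(outer):
            result = func(result)
        return result


iw = IWSketch()
newlist = iw.list.uniq.sort(["b", "a", "b"])
print(newlist)   # ['a', 'b']

words = ["larch tree", "parrot on a larch", "oak"]
temp = iw.grep(words, "larch")
filtered = iw.list.grep(temp, "parrot", "v")
print(filtered)  # ['larch tree']
```

Because every function except the rightmost is called with exactly one argument, this design has the same limitation described above: only the final call can carry extra arguments.
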
I wish there were a special method __set__ called when an expression is being assigned somewhere; that would make this trivial.

What is the friendliest and most Pythonic way to write a Pipe class for iterwrap?

P.S. I have experimented with overloading the | operator to allow this syntax:

    newlist = Pipe(mylist) | sort | uniq | list()

Personally, I much prefer the dot-chaining syntax. The above is just too tricky.

-- 
Steve R. Hastings    "Vita est"
http://www.blarg.net/~steveha