On Nov 11, 10:34 am, Peter Otten <__pete...@web.de> wrote:
> Steve Howell wrote:
> > On Nov 11, 1:09 am, Paul Rubin <no.em...@nospam.invalid> wrote:
> >> Cameron Simpson <c...@zip.com.au> writes:
> >> > | I'd prefer the original code ten times over this inaccessible beast.
> >> > Me too.
>
> >> Me, I like the itertools version better.  There's one chunk of data
> >> that goes through a succession of transforms each of which
> >> is very straightforward.
>
> > Thanks, Paul.
>
> > Even though I supplied the "inaccessible" itertools version, I can
> > understand why folks find it inaccessible.  As I said to the OP, there
> > was nothing wrong with the original imperative approach; I was simply
> > providing an alternative.
>
> > It took me a while to appreciate itertools, but the metaphor that
> > resonates with me is a Unix pipeline.  It's just a metaphor, so folks
> > shouldn't be too literal, but the idea here is this:
>
> >   page_nums -> pages -> valid_pages -> tweets
>
> > The transforms are this:
>
> >   page_nums -> pages: call API via imap
> >   pages -> valid_pages: take while true
> >   valid_pages -> tweets: use chain.from_iterable to flatten results
>
> > Here's the code again for context:
>
> >     def get_tweets(term):
> >         def get_page(page):
> >             return getSearch(term, page)
> >         page_nums = itertools.count(1)
> >         pages = itertools.imap(get_page, page_nums)
> >         valid_pages = itertools.takewhile(bool, pages)
> >         tweets = itertools.chain.from_iterable(valid_pages)
> >         return tweets
>
> Actually you supplied the "accessible" itertools version. For reference,
> here's the inaccessible version:
>
> class api:
>     """Twitter search API mock-up"""
>     pages = [
>         ["a", "b", "c"],
>         ["d", "e"],
>     ]
>     @staticmethod
>     def GetSearch(term, page):
>         assert term == "foo"
>         assert page >= 1
>         if page > len(api.pages):
>             return []
>         return api.pages[page-1]
>
> from collections import deque
> from functools import partial
> from itertools import chain, count, imap, takewhile
>
> def process(tweet):
>     print tweet
>
> term = "foo"
>
> deque(
>     imap(
>         process,
>         chain.from_iterable(
>             takewhile(bool, imap(partial(api.GetSearch, term), count(1))))),
>     maxlen=0)
>
> ;)

I know Peter's version is tongue in cheek, but I do think that it has a
certain expressive power, and it highlights three mind-expanding Python
modules.

Here's a re-flattened take on Peter's version ("Flat is better than
nested." -- PEP 20):

    term = "foo"
    search = partial(api.GetSearch, term)
    nums = count(1)
    paged_tweets = imap(search, nums)
    paged_tweets = takewhile(bool, paged_tweets)
    tweets = chain.from_iterable(paged_tweets)
    processed_tweets = imap(process, tweets)
    deque(processed_tweets, maxlen=0)

The use of deque to exhaust an iterator is slightly overboard IMHO, but
all the other lines of code can be fairly easily understood once you
read the docs.

    partial: http://docs.python.org/2/library/functools.html
    count, imap, takewhile, chain.from_iterable:
        http://docs.python.org/2/library/itertools.html
    deque: http://docs.python.org/2/library/collections.html
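
P.S. For anyone reading along on Python 3, here is a rough sketch of the same pipeline. It assumes only that itertools.imap is gone (the built-in map is lazy there) and that print is a function. PAGES and fake_search are hypothetical stand-ins I made up to mirror Peter's mock-up; they are not the real Twitter API. I collect into a list instead of printing so the result is easy to check.

```python
from collections import deque
from functools import partial
from itertools import chain, count, takewhile

# Hypothetical stand-in for the search API mock-up quoted above.
PAGES = [
    ["a", "b", "c"],
    ["d", "e"],
]

def fake_search(term, page):
    """Return the tweets on a 1-based page, or [] past the last page."""
    if page > len(PAGES):
        return []
    return PAGES[page - 1]

def get_tweets(term, search=fake_search):
    """Lazily yield tweets from successive pages until an empty page."""
    page_nums = count(1)
    pages = map(partial(search, term), page_nums)  # Python 3 map is lazy
    valid_pages = takewhile(bool, pages)           # stop at first empty page
    return chain.from_iterable(valid_pages)        # flatten pages into tweets

seen = []
# The same deque(maxlen=0) trick: exhaust the iterator for its side effects.
deque(map(seen.append, get_tweets("foo")), maxlen=0)

# A plain for loop does the same job and is arguably easier to read.
for tweet in get_tweets("foo"):
    print(tweet)
```

Nothing is fetched until something consumes the iterator, so takewhile gets a chance to stop the otherwise infinite count(1) at the first empty page.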