On Sunday, November 11, 2012 1:54:46 AM UTC-8, Peter Otten wrote:
> Paul Rubin wrote:
>
> > Cameron Simpson <c...@zip.com.au> writes:
> >> | I'd prefer the original code ten times over this inaccessible beast.
> >> Me too.
> >
> > Me, I like the itertools version better.  There's one chunk of data
> > that goes through a succession of transforms each of which
> > is very straightforward.
>
> [Steve Howell]
> > def get_tweets(term, get_page):
> >     page_nums = itertools.count(1)
> >     pages = itertools.imap(api.getSearch, page_nums)
> >     valid_pages = itertools.takewhile(bool, pages)
> >     tweets = itertools.chain.from_iterable(valid_pages)
> >     return tweets
>
> But did you spot the bug(s)?

My first version was sketching out the technique, and I don't have handy
access to the API.  Here is an improved version:

    import itertools

    def get_tweets(term):
        def get_page(page):
            return getSearch(term, page)
        page_nums = itertools.count(1)
        pages = itertools.imap(get_page, page_nums)
        valid_pages = itertools.takewhile(bool, pages)
        tweets = itertools.chain.from_iterable(valid_pages)
        return tweets

    for tweet in get_tweets("foo"):
        process(tweet)

This is what I used to test it:

    def getSearch(term = "foo", page = 1):
        # simulate api for testing
        if page < 5:
            return [
                'page %d, tweet A for term %s' % (page, term),
                'page %d, tweet B for term %s' % (page, term),
                ]
        else:
            return None

    def process(tweet):
        print tweet
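
For anyone reading this on Python 3, where itertools.imap is gone and the built-in map is already lazy, the same pipeline might look roughly like the sketch below. The get_search stub and the optional search parameter are stand-ins invented for illustration (the real API was not available here either), so treat this as a sketch of the technique rather than working client code.

```python
import itertools


def get_search(term="foo", page=1):
    # Hypothetical stand-in for the real search API: four pages of two
    # tweets each, then None to signal that there are no more results.
    if page < 5:
        return [
            "page %d, tweet A for term %s" % (page, term),
            "page %d, tweet B for term %s" % (page, term),
        ]
    return None


def get_tweets(term, search=get_search):
    # 'search' is an assumed hook so the stub can be swapped for a real call.
    def get_page(page):
        return search(term, page)

    page_nums = itertools.count(1)                   # 1, 2, 3, ...
    pages = map(get_page, page_nums)                 # lazy in Python 3
    valid_pages = itertools.takewhile(bool, pages)   # stop at first empty/None page
    return itertools.chain.from_iterable(valid_pages)


if __name__ == "__main__":
    for tweet in get_tweets("foo"):
        print(tweet)
```

Because every stage is lazy, a page should only be fetched when the consumer actually needs a tweet from it; taking the first three tweets with itertools.islice would request pages 1 and 2 and nothing beyond.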