Re: getting n items at a time from a generator
On Dec 27, 7:07 pm, Paul Hankin <[EMAIL PROTECTED]> wrote:
> On Dec 27, 11:34 am, Kugutsumen <[EMAIL PROTECTED]> wrote:
>
> > I am relatively new to the Python language and I am afraid I am missing
> > some clever construct or built-in way equivalent to my 'chunk'
> > generator below.
>
> > def chunk(size, items):
> >     """generate N items from a generator."""
> >     chunk = []
> >     count = 0
> >     while True:
> >         try:
> >             item = items.next()
> >             count += 1
> >         except StopIteration:
> >             yield chunk
> >             break
> >         chunk.append(item)
> >         if not (count % size):
> >             yield chunk
> >             chunk = []
> >             count = 0
>
> The itertools module is always a good place to look when you've got a
> complicated generator.
>
> import itertools
> import operator
>
> def chunk(N, items):
>     "Group items in chunks of N"
>     def clump((n, _)):
>         return n // N
>     for _, group in itertools.groupby(enumerate(items), clump):
>         yield itertools.imap(operator.itemgetter(1), group)
>
> for ch in chunk(7, range(30)):
>     print list(ch)
>
> I've changed chunk to return a generator rather than building a list
> which is probably only going to be iterated over. But if you prefer
> the list version, replace 'itertools.imap' with 'map'.
>
> --
> Paul Hankin

Thanks, I am going to take a look at itertools. I prefer the list
version since I need to buffer that chunk in memory at this point.

--
http://mail.python.org/mailman/listinfo/python-list
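Paul's groupby version relies on Python 2 features (a tuple parameter in
clump, itertools.imap, the print statement). For readers on Python 3, here
is a minimal sketch of the same idea in its list-building form. It is an
adaptation, not the code posted in the thread; the lambda key replaces the
nested clump function.

```python
import itertools


def chunk(n, items):
    """Group items into lists of at most n elements (Python 3 sketch)."""
    # enumerate() pairs each item with its index. index // n is constant
    # within a chunk, so groupby() starts a new group every n items.
    indexed = enumerate(items)
    for _, group in itertools.groupby(indexed, key=lambda pair: pair[0] // n):
        # Drop the index and keep only the original item.
        yield [item for _, item in group]


chunks = list(chunk(7, range(30)))
for ch in chunks:
    print(ch)
```

Each group has to be consumed before groupby advances to the next one, which
is why the list is built inside the loop.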
Re: getting n items at a time from a generator
On Dec 27, 7:24 pm, Terry Jones <[EMAIL PROTECTED]> wrote:
> >>>>> "Kugutsumen" == Kugutsumen <[EMAIL PROTECTED]> writes:
>
> Kugutsumen> On Dec 27, 7:07 pm, Paul Hankin <[EMAIL PROTECTED]> wrote:
>
> >> On Dec 27, 11:34 am, Kugutsumen <[EMAIL PROTECTED]> wrote:
>
> >> > I am relatively new to the Python language and I am afraid I am missing
> >> > some clever construct or built-in way equivalent to my 'chunk'
> >> > generator below.
>
> Kugutsumen> Thanks, I am going to take a look at itertools. I prefer the
> Kugutsumen> list version since I need to buffer that chunk in memory at
> Kugutsumen> this point.
>
> Also consider this solution from O'Reilly's Python Cookbook (2nd Ed.) p705
>
> def chop(iterable, length=2):
>     return izip(*(iter(iterable),) * length)
>
> Terry

Thanks Terry,

However, chop drops the leftover items at the end of the data (28 and 29
in this example).

>>> t = (i for i in range(30))
>>> c = chop(t, 7)
>>> for ch in c:
...     print ch
...
(0, 1, 2, 3, 4, 5, 6)
(7, 8, 9, 10, 11, 12, 13)
(14, 15, 16, 17, 18, 19, 20)
(21, 22, 23, 24, 25, 26, 27)

k
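The remainder is lost because izip stops as soon as one of its arguments is
exhausted, and here every argument is the same iterator. One possible way
to keep the short final group, sketched in Python 3 with
itertools.zip_longest and a private sentinel. The _FILL name is
hypothetical and this variant is not from the Cookbook.

```python
from itertools import zip_longest

# Unique padding marker (hypothetical name); using object() means it can
# never be confused with a real item such as None or 0.
_FILL = object()


def chop(iterable, length=2):
    """Yield tuples of up to `length` items, keeping a short final tuple."""
    # The same iterator repeated `length` times: each output tuple pulls
    # `length` consecutive items from it.
    args = [iter(iterable)] * length
    for group in zip_longest(*args, fillvalue=_FILL):
        # Strip the padding that zip_longest added to an incomplete group.
        yield tuple(x for x in group if x is not _FILL)


result = list(chop(range(30), 7))
for ch in result:
    print(ch)
```

The trade-off is an extra pass over each tuple to strip the padding, so for
very large chunks a plain loop or islice may be the simpler choice.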
getting n items at a time from a generator
I am relatively new to the Python language and I am afraid I am missing
some clever construct or built-in way equivalent to my 'chunk'
generator below.

def chunk(size, items):
    """generate N items from a generator."""
    chunk = []
    count = 0
    while True:
        try:
            item = items.next()
            count += 1
        except StopIteration:
            yield chunk
            break
        chunk.append(item)
        if not (count % size):
            yield chunk
            chunk = []
            count = 0

>>> t = (i for i in range(30))
>>> c = chunk(7, t)
>>> for i in c:
...     print i
...
[0, 1, 2, 3, 4, 5, 6]
[7, 8, 9, 10, 11, 12, 13]
[14, 15, 16, 17, 18, 19, 20]
[21, 22, 23, 24, 25, 26, 27]
[28, 29]

In my real-world project, I have over 250 million items which together
are too big to fit in memory. They are processed and later used to update
records in a database. To minimize disk I/O, I found it was more efficient
to process them in batches, or "chunks", of 50,000 or so.

Hence my question: is this the proper way to do this?
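One edge case in the generator above: when the number of items is an exact
multiple of size (or the input is empty), the yield in the except branch
emits a trailing empty list. A minimal Python 3 sketch that keeps the same
argument order but skips the empty tail. The name batch is an illustrative
choice, not from the original post.

```python
def chunk(size, items):
    """Yield lists of up to `size` items; never yields an empty list."""
    batch = []
    for item in items:
        batch.append(item)
        if len(batch) == size:
            yield batch
            batch = []  # start a fresh list; the old one belongs to the caller
    if batch:
        # Leftover items that did not fill a whole chunk.
        yield batch


exact = list(chunk(7, range(28)))   # input length is a multiple of size
ragged = list(chunk(7, range(30)))  # two items left over
print(len(exact), ragged[-1])
```

Looping with for also removes the need to call next() and catch
StopIteration by hand.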
Re: getting n items at a time from a generator
On Dec 27, 7:24 pm, Terry Jones <[EMAIL PROTECTED]> wrote:
> >>>>> "Kugutsumen" == Kugutsumen <[EMAIL PROTECTED]> writes:
>
> Kugutsumen> On Dec 27, 7:07 pm, Paul Hankin <[EMAIL PROTECTED]> wrote:
>
> >> On Dec 27, 11:34 am, Kugutsumen <[EMAIL PROTECTED]> wrote:
>
> >> > I am relatively new to the Python language and I am afraid I am missing
> >> > some clever construct or built-in way equivalent to my 'chunk'
> >> > generator below.
>
> Kugutsumen> Thanks, I am going to take a look at itertools. I prefer the
> Kugutsumen> list version since I need to buffer that chunk in memory at
> Kugutsumen> this point.
>
> Also consider this solution from O'Reilly's Python Cookbook (2nd Ed.) p705
>
> def chop(iterable, length=2):
>     return izip(*(iter(iterable),) * length)
>
> Terry
> [snip code]
>
> Try this instead:
>
> import itertools
>
> def chunk(iterator, size):
>     # I prefer the argument order to be the reverse of yours.
>     while True:
>         chunk = list(itertools.islice(iterator, size))
>         if chunk: yield chunk
>         else: break
>

Steven, I really like your version since I managed to understand it in
one pass. Paul's version works, but it is too obscure for me to read :)

Thanks a lot again.
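A note on the islice version: it expects a true iterator. If it is handed a
list, islice restarts from the beginning on every pass and the loop never
ends. The sketch below (Python 3) wraps the argument in iter() to guard
against that, and shows how the batches might feed a database update.
update_records is a hypothetical stand-in, not something from the thread.

```python
import itertools


def chunk(iterable, size):
    """Yield successive lists of up to `size` items from any iterable."""
    # iter() matters: islice() on a list would start over each time,
    # whereas a single shared iterator remembers its position.
    iterator = iter(iterable)
    while True:
        batch = list(itertools.islice(iterator, size))
        if not batch:
            return
        yield batch


def update_records(batch):
    """Hypothetical placeholder for one bulk database write."""
    return len(batch)


# Only one batch of 50,000 items is held in memory at a time.
sizes = [len(b) for b in chunk(range(120_000), 50_000)]
written = sum(update_records(b) for b in chunk(range(120_000), 50_000))
print(sizes, written)
```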
Looking for Python Developers in Jakarta, Indonesia
Responsibilities:

You will be part of a team responsible for implementing and supporting
a Google App Engine based application.

Requirements:

Professional:
- University bachelor's degree in computer science or a related
  discipline (web development, design, relevant computer languages and
  software applications).
- Python.
- Experience working in a web framework: Django or GAE (Google App
  Engine) preferred.
- Source control experience (branching, merging, conflict resolution).
- Knowledge of and experience with unit testing and testing concepts.

Other Skills:
* Able to work in a fast-paced environment and quickly produce
  deliverables.
* Able to communicate effectively with both technical and non-technical
  staff.
* Able to work independently with minimal supervision.

If you have a blog, a Bitbucket, GitHub or Google Code repository, or
open-source or other interesting projects that you feel may be relevant
to your application, please provide links for review.

We thank all candidates for their interest; however, only those selected
for an interview will be contacted.

Contact: kugutsumen+j...@gmail.com