Re: itertools.groupby

2013-04-22 Thread Chris Angelico
On Tue, Apr 23, 2013 at 12:49 AM, Oscar Benjamin wrote: > Iterators are > typically preferred over list slicing for sequential text file access > since you can avoid loading the whole file at once. This means that > you can process a large file while only using a constant amount of > memory. And,
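A minimal sketch of the iterator-based approach described above, assuming the input format from the original question (a fixed header line that opens each group). The name `iter_groups` and the in-memory sample are illustrative and do not come from the thread. Only one group is held in memory at a time, so a large file never has to be loaded whole.

```python
import io

def iter_groups(lines, header="Starting a new group"):
    """Yield one list per group while reading the input line by line."""
    group = None  # None until the first header has been seen
    for raw in lines:
        line = raw.strip()
        if line == header:
            if group is not None:
                yield group
            group = []
        elif group is not None:
            group.append(line)
    if group is not None:
        yield group  # the last group may be empty

# A file object is itself an iterator over lines; StringIO stands in for one.
sample = io.StringIO(
    "Starting a new group\na\nb\nc\n"
    "Starting a new group\n1\n2\n3\n4\n"
    "Starting a new group\nX\nY\nZ\n"
    "Starting a new group\n"
)
groups = list(iter_groups(sample))
```

With a real file, `iter_groups(open(path))` could be consumed in a `for` loop without ever building the full list.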

Re: itertools.groupby

2013-04-22 Thread Neil Cerutti
On 2013-04-22, Oscar Benjamin wrote:
> On 22 April 2013 15:24, Neil Cerutti wrote:
>>
>> Hrmmm, hoomm. Nobody cares for slicing any more.
>>
>> def headered_groups(lst, header):
>>     b = lst.index(header) + 1
>>     while True:
>>         try:
>>             e = lst.index(header, b)

Re: itertools.groupby

2013-04-22 Thread Oscar Benjamin
On 22 April 2013 15:24, Neil Cerutti wrote:
>
> Hrmmm, hoomm. Nobody cares for slicing any more.
>
> def headered_groups(lst, header):
>     b = lst.index(header) + 1
>     while True:
>         try:
>             e = lst.index(header, b)
>         except ValueError:
>             yield lst[b:]

Re: itertools.groupby

2013-04-22 Thread Neil Cerutti
On 2013-04-20, Jason Friedman wrote: > I have a file such as: > > $ cat my_data > Starting a new group > a > b > c > Starting a new group > 1 > 2 > 3 > 4 > Starting a new group > X > Y > Z > Starting a new group > > I am wanting a list of lists: > ['a', 'b', 'c'] > ['1', '2', '3', '4'] > ['X', 'Y'

Re: itertools.groupby

2013-04-22 Thread Wolfgang Maier
Jason Friedman writes:
> Thank you for the responses! Not sure yet which one I will pick.
Hi again, I was working a bit on my own solution and on the one from Steven/Joshua, and maybe that helps you decide:
def separate_on(iterable, separator): # based on groupby sep

Re: itertools.groupby

2013-04-21 Thread Joshua Landau
On 21 April 2013 01:13, Steven D'Aprano <steve+comp.lang.pyt...@pearwood.info> wrote:
> I wouldn't use groupby. It's a hammer, not every grouping job is a nail.
>
> Instead, use a simple accumulator:
>
> def group(lines):
>     accum = []
>     for line in lines:
>         line = line.strip()

Re: itertools.groupby

2013-04-21 Thread Jason Friedman
#!/usr/bin/python3
> from itertools import groupby
>
> def get_lines_from_file(file_name):
>     with open(file_name) as reader:
>         for line in reader.readlines():
>             yield(line.strip())
>
> counter = 0
> def key_func(x):
>     if x.startswith("Starting a new group"):
>         g
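A minimal sketch of the counter-as-key idea that the quoted script appears to be heading toward. It is not the poster's code: the closure `make_key_func` and the helper `grouped` are hypothetical names, and the closure replaces the module-level counter. Each header line bumps the counter, so every line between two headers shares one key and `groupby` splits exactly at the headers.

```python
from itertools import groupby

HEADER = "Starting a new group"

def make_key_func(header=HEADER):
    """Return a key function whose value increases at every header line."""
    counter = 0
    def key_func(line):
        nonlocal counter
        if line.startswith(header):
            counter += 1
        return counter
    return key_func

def grouped(lines, header=HEADER):
    """Yield the members of each group, with the header line itself dropped."""
    for _, members in groupby(lines, make_key_func(header)):
        yield [m for m in members if not m.startswith(header)]

lines = [HEADER, "a", "b", "c", HEADER, "1", "2", "3", "4",
         HEADER, "X", "Y", "Z", HEADER]
result = list(grouped(lines))
```

A stateful key function works here only because `groupby` evaluates the key once per element, in order; any lines before the first header would come out as an extra leading group.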

Re: itertools.groupby

2013-04-21 Thread Peter Otten
Jason Friedman wrote: > I have a file such as: > > $ cat my_data > Starting a new group > a > b > c > Starting a new group > 1 > 2 > 3 > 4 > Starting a new group > X > Y > Z > Starting a new group > > I am wanting a list of lists: > ['a', 'b', 'c'] > ['1', '2', '3', '4'] > ['X', 'Y', 'Z'] > [] >

Re: itertools.groupby

2013-04-20 Thread Steven D'Aprano
On Sat, 20 Apr 2013 11:09:42 -0600, Jason Friedman wrote: > I have a file such as: > > $ cat my_data > Starting a new group > a > b > c > Starting a new group > 1 > 2 > 3 > 4 > Starting a new group > X > Y > Z > Starting a new group > > I am wanting a list of lists: > ['a', 'b', 'c'] > ['1', '2'

Re: itertools.groupby

2013-04-20 Thread Wolfgang Maier
Jason Friedman writes:
> I have a file such as:
>
> $ cat my_data
> Starting a new group
> a
> b
> c
> Starting a new group
> 1
> 2
> 3
> 4
> Starting a new group
> X
> Y
> Z
> Starting a new group
>
> I am wanting a list of lists:
> ['a', 'b', 'c']
> ['1', '2', '3',

Re: itertools.groupby

2013-04-20 Thread Ned Batchelder
On 4/20/2013 1:09 PM, Jason Friedman wrote:
> I have a file such as:
>
> $ cat my_data
> Starting a new group
> a
> b
> c
> Starting a new group
> 1
> 2
> 3
> 4
> Starting a new group
> X
> Y
> Z
> Starting a new group
>
> I am wanting a list of lists:
> ['a', 'b', 'c']
> ['1', '2', '3', '4']
> ['X', 'Y', 'Z']
> []
>
> I wrote this: -

Re: itertools.groupby usage to get structured data

2011-02-07 Thread nn
On Feb 5, 7:12 am, Peter Otten <__pete...@web.de> wrote: > Slafs wrote: > > Hi there! > > > I'm having trouble to wrap my brain around this kind of problem: > > > What I have : > >   1) list of dicts > >   2) list of keys that i would like to be my grouping arguments of > > elements from 1) > >   3

Re: itertools.groupby usage to get structured data

2011-02-05 Thread Peter Otten
Slafs wrote: > Hi there! > > I'm having trouble to wrap my brain around this kind of problem: > > What I have : > 1) list of dicts > 2) list of keys that i would like to be my grouping arguments of > elements from 1) > 3) list of keys that i would like to do "aggregation" on the elements > of

Re: itertools.groupby usage to get structured data

2011-02-05 Thread Slafs
On 5 Lut, 05:58, Paul Rubin wrote: > Slafs writes: > > What i want to have is: > > a "big" nested dictionary with 'g1' values as 1st level keys and a > > dictionary of aggregates and "subgroups" in it > > > I was looking for a solution that would let me do that kind of > > grouping with varia

Re: itertools.groupby usage to get structured data

2011-02-04 Thread Paul Rubin
Slafs writes: > What i want to have is: > a "big" nested dictionary with 'g1' values as 1st level keys and a > dictionary of aggregates and "subgroups" in it > > I was looking for a solution that would let me do that kind of > grouping with variable lists of 2) and 3) i.e. having also 'g3' as

Re: itertools.groupby usage to get structured data

2011-02-04 Thread Steven D'Aprano
On Fri, 04 Feb 2011 15:14:24 -0800, Slafs wrote: > Hi there! > > I'm having trouble to wrap my brain around this kind of problem: Perhaps you should consider backing up and starting from somewhere else with different input data, or changing the requirements. Just a thought. > What I have : >

Re: itertools.groupby

2008-01-16 Thread Tobiah
Paul Rubin wrote: > Tobiah <[EMAIL PROTECTED]> writes: >> I tried doing this with a simple example, but noticed >> that [].sort(func) passes two arguments to func, whereas >> the function expected by groupby() uses only one argument. > > Use: [].sort(key=func) Oh cool. Thanks. Only in 2.4+ it s

Re: itertools.groupby

2008-01-16 Thread Paul Rubin
Tobiah <[EMAIL PROTECTED]> writes: > I tried doing this with a simple example, but noticed > that [].sort(func) passes two arguments to func, whereas > the function expected by groupby() uses only one argument. Use: [].sort(key=func)
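A short sketch of the point made above: `list.sort(key=...)` and `groupby(..., key=...)` both take a one-argument key function, so the same function can be passed to each. The word list and the name `first_letter` are illustrative only.

```python
from itertools import groupby

def first_letter(word):
    """One-argument key function shared by sort() and groupby()."""
    return word[0]

words = ["banana", "apple", "cherry", "avocado", "blueberry"]

# Sort first so that equal keys become adjacent, then group on the same key.
words.sort(key=first_letter)
by_letter = {key: list(group) for key, group in groupby(words, key=first_letter)}
```

Because the sort is stable, words that share a first letter keep their original relative order inside each group.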

Re: itertools.groupby

2007-06-05 Thread BJörn Lindqvist
On 27 May 2007 10:49:06 -0700, 7stud <[EMAIL PROTECTED]> wrote: > On May 27, 11:28 am, Steve Howell <[EMAIL PROTECTED]> wrote: > > The groupby method has its uses, but its behavior is > > going to be very surprising to anybody that has used > > the "group by" syntax of SQL, because Python's groupb

Re: itertools.groupby

2007-06-05 Thread [EMAIL PROTECTED]
[EMAIL PROTECTED] wrote: > On May 27, 7:50 pm, Raymond Hettinger <[EMAIL PROTECTED]> wrote: > > The groupby itertool came-out in Py2.4 and has had remarkable > > success (people seem to get what it does and like using it, and > > there have been no bug reports or reports of usability problems). > >

Re: itertools.groupby

2007-06-05 Thread [EMAIL PROTECTED]
On May 27, 7:50 pm, Raymond Hettinger <[EMAIL PROTECTED]> wrote: > The groupby itertool came-out in Py2.4 and has had remarkable > success (people seem to get what it does and like using it, and > there have been no bug reports or reports of usability problems). With due respect, I disagree. Bug

Re: itertools.groupby (gauntlet thrown down)

2007-05-29 Thread Steve Howell
> Raymond Hettinger <[EMAIL PROTECTED]> writes: > > The gauntlet has been thrown down. Any creative > thinkers > > up to the challenge? Give me cool recipes. > Twin primes? (Sorry, no code, but there's a good Python example somewhere that returns an iterator that keeps doing the sieve, feed it

Re: itertools.groupby

2007-05-29 Thread Steve Howell
On May 29, 2:34 am, Raymond Hettinger <[EMAIL PROTECTED]> wrote: > The gauntlet has been thrown down. Any creative > thinkers > up to the challenge? Give me cool recipes. > I don't make any claims to coolness, but I can say that I myself would have written the code below with significantly more

Re: itertools.groupby

2007-05-29 Thread Paul Rubin
Raymond Hettinger <[EMAIL PROTECTED]> writes: > The gauntlet has been thrown down. Any creative thinkers > up to the challenge? Give me cool recipes. Here is my version (with different semantics) of the grouper recipe in the existing recipe section: snd = operator.itemgetter(1) # I use thi

Re: itertools.groupby

2007-05-29 Thread George Sakkis
On May 29, 2:34 am, Raymond Hettinger <[EMAIL PROTECTED]> wrote: > If the posters on this thread have developed an interest > in the subject, I would find it useful to hear their > ideas on new and creative ways to use groupby(). The > analogy to UNIX's uniq filter was found only after the > desi

Re: itertools.groupby

2007-05-29 Thread Steve Howell
--- Carsten Haese <[EMAIL PROTECTED]> wrote: > As an aside, while groupby() will indeed often be > used in conjunction > with sorted(), there is a significant class of use > cases where that's > not the case: I use groupby to produce grouped > reports from the results > of an SQL query. In such ca
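A sketch of the SQL-report use case mentioned above, assuming the rows arrive already ordered by the database (via `ORDER BY`), so no `sorted()` call is needed on the Python side. The standard-library `sqlite3` module stands in for whatever database the poster used, and the `sales` table and its columns are invented for illustration.

```python
import sqlite3
from itertools import groupby
from operator import itemgetter

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, item TEXT, amount INTEGER)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [("north", "widget", 3), ("south", "gadget", 5),
     ("north", "gadget", 2), ("south", "widget", 1)],
)

# The database does the sorting; groupby only has to split the runs.
rows = conn.execute(
    "SELECT region, item, amount FROM sales ORDER BY region, item"
)
report = {
    region: [(item, amount) for _, item, amount in group]
    for region, group in groupby(rows, key=itemgetter(0))
}
conn.close()
```

The grouping column has to lead the `ORDER BY` clause; otherwise one region could be split across several runs.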

Re: itertools.groupby

2007-05-29 Thread Carsten Haese
On Mon, 2007-05-28 at 23:34 -0700, Raymond Hettinger wrote: > On May 28, 8:36 pm, "Carsten Haese" <[EMAIL PROTECTED]> wrote: > > And while > > we're at it, it probably should be keyfunc(value), not key(value). > > No dice. The itertools.groupby() function is typically used > in conjunction with s

Re: itertools.groupby

2007-05-29 Thread Raymond Hettinger
On May 28, 8:02 pm, Gordon Airporte <[EMAIL PROTECTED]> wrote: > "Each" seems to imply uniqueness here. Doh! This sort of micro-massaging the docs misses the big picture. If "each" meant unique across the entire input stream, then how the heck could the function work without reading in the entire

Re: itertools.groupby

2007-05-28 Thread Raymond Hettinger
On May 28, 8:36 pm, "Carsten Haese" <[EMAIL PROTECTED]> wrote: > And while > we're at it, it probably should be keyfunc(value), not key(value). No dice. The itertools.groupby() function is typically used in conjunction with sorted(). It would be a mistake to call it keyfunc in one place and not

Re: itertools.groupby

2007-05-28 Thread Carsten Haese
On Mon, 28 May 2007 23:02:31 -0400, Gordon Airporte wrote
> '''
> class groupby(__builtin__.object)
>  |  groupby(iterable[, keyfunc]) -> create an iterator which returns
>  |  (key, sub-iterator) grouped by each value of key(value).
>  |
> '''
>
> "Each" seems to imply uniqueness here.
Yes, I

Re: itertools.groupby

2007-05-28 Thread Paul Rubin
Gordon Airporte <[EMAIL PROTECTED]> writes: > "itertools.groupby_except_the_notion_of_uniqueness_is_limited_to- > _contiguous_runs_of_elements_having_the_same_key()" doesn't have much > of a ring to it. I guess this gets back to documentation problems, > because the help string says nothing about t

Re: itertools.groupby

2007-05-28 Thread Gordon Airporte
Paul Rubin wrote: > It chops up the iterable into a bunch of smaller ones, but the total > size ends up the same. "Telescope", "compact", "collapse" etc. make > it sound like the output is going to end up smaller than the input. Good point... I guess I was thinking in terms of the number of iter

Re: itertools.groupby

2007-05-28 Thread Paul Rubin
Paul Rubin writes: > >See http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/259173 > But that recipe generates the groups in a random order depending on > the dict hashing, Correction, it generates the right order in this case, although it builds up an in-memo

Re: itertools.groupby

2007-05-28 Thread Paul Rubin
Raymond Hettinger <[EMAIL PROTECTED]> writes: > I think the OP would have been better-off with plain > vanilla Python such as: > >See http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/259173 But that recipe generates the groups in a random order depending on the dict hashing, instead of

Re: itertools.groupby

2007-05-28 Thread Steve Howell
--- Alex Martelli <[EMAIL PROTECTED]> wrote: > Steve Howell <[EMAIL PROTECTED]> wrote: >... > > for has_chars, frags in itertools.groupby(lines, > > lambda x: len(x) > 0): > > Hmmm, it appears to me that itertools.groupby(lines, > bool) should do > just the same job, just a bit faster and si

Re: itertools.groupby

2007-05-28 Thread Steve Howell
--- Raymond Hettinger <[EMAIL PROTECTED]> wrote: > > > That's not for everyone, so it isn't a loss if > > > someone sticks > > > with writing plain, clear everyday Python > instead of > > > an itertool. > > > > I know most of the module is fairly advanced, and > that > > average users can mostly

Re: itertools.groupby

2007-05-28 Thread Steve Howell
--- Paul Rubin <"http://phr.cx"@NOSPAM.invalid> wrote: > > > But that is what groupby does, except its notion of > uniqueness is > limited to contiguous runs of elements having the > same key. It occurred to me that we could also rename the function uniq(), or unique(), after its Unix counterpa
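A small sketch of the "contiguous runs" behaviour discussed above, which is what makes `groupby` resemble Unix `uniq` rather than SQL's `GROUP BY`. The sample data is illustrative.

```python
from itertools import groupby

data = ["a", "a", "b", "a", "c", "c"]

# Unsorted input: the key "a" shows up twice, once for each separate run.
runs = [(key, len(list(group))) for key, group in groupby(data)]

# Sorted input: equal keys are adjacent, so each key appears exactly once.
sorted_runs = [(key, len(list(group))) for key, group in groupby(sorted(data))]

# Keeping only the keys gives a uniq-style collapse of adjacent duplicates.
collapsed = [key for key, _ in groupby(data)]
```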

Re: itertools.groupby

2007-05-28 Thread Paul Rubin
Gordon Airporte <[EMAIL PROTECTED]> writes: > This is my first exposure to this function, and I see that it does > have some uses in my code. I agree that it is confusing, however. > IMO the confusion could be lessened if the function with the current > behavior were renamed 'telescope' or 'compact

Re: itertools.groupby

2007-05-28 Thread Gordon Airporte
7stud wrote: > Bejeezus. The description of groupby in the docs is a poster child > for why the docs need user comments. Can someone explain to me in > what sense the name 'uniquekeys' is used in this example: > This is my first exposure to this function, and I see that it does have some uses in

Re: itertools.groupby

2007-05-28 Thread Alex Martelli
Steve Howell <[EMAIL PROTECTED]> wrote:
...
> for has_chars, frags in itertools.groupby(lines,
>         lambda x: len(x) > 0):
Hmmm, it appears to me that itertools.groupby(lines, bool) should do just the same job, just a bit faster and simpler, no?
Alex
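A quick sketch comparing the two key functions from this exchange on a list of strings, where empty strings mark paragraph breaks. The sample `lines` list is illustrative. For strings, `bool(x)` and `len(x) > 0` agree, so both calls split the input at the same places.

```python
from itertools import groupby

lines = ["first paragraph", "continues", "", "", "second paragraph", ""]

# bool as the key: runs of non-empty lines get key True, blank runs get False.
with_bool = [
    list(frags) for has_chars, frags in groupby(lines, bool) if has_chars
]

# The original lambda produces the same grouping.
with_lambda = [
    list(frags)
    for has_chars, frags in groupby(lines, lambda x: len(x) > 0)
    if has_chars
]
```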

Re: itertools.groupby

2007-05-28 Thread Raymond Hettinger
> > That's not for everyone, so it isn't a loss if > > someone sticks > > with writing plain, clear everyday Python instead of > > an itertool. > > I know most of the module is fairly advanced, and that > average users can mostly avoid it, but this is a very > common-antipattern that groupby() solv

Re: itertools.groupby

2007-05-28 Thread Steve Howell
--- Raymond Hettinger <[EMAIL PROTECTED]> wrote: > That's not for everyone, so it isn't a loss if > someone sticks > with writing plain, clear everyday Python instead of > an itertool. > I know most of the module is fairly advanced, and that average users can mostly avoid it, but this is a very

Re: itertools.groupby

2007-05-28 Thread Steve Howell
--- Carsten Haese <[EMAIL PROTECTED]> wrote: > On Sun, 2007-05-27 at 18:12 -0700, Steve Howell > wrote: > > [...] there is no way > > that "uniquekeys" is a sensible variable [...] > > That's because the OP didn't heed the advice from > the docs that > "Generally, the iterable needs to already b

Re: itertools.groupby

2007-05-28 Thread Raymond Hettinger
On May 28, 8:34 am, 7stud <[EMAIL PROTECTED]> wrote: > >- there are two more examples on the next page. those two > > examples also give sample inputs and outputs. > > I didn't see those. Ah, there's the rub. The two sections of examples and recipes are there for a reason. This isn't a beginne

Re: itertools.groupby

2007-05-28 Thread 7stud
On May 27, 6:50 pm, Raymond Hettinger <[EMAIL PROTECTED]> wrote: > On May 27, 2:59 pm, Steve Howell <[EMAIL PROTECTED]> wrote: > > > These docs need work. Please do not defend them; > > please suggest improvements. > > FWIW, I wrote those docs. Suggested improvements are > welcome; however, I thi

Re: itertools.groupby

2007-05-28 Thread Steve Howell
--- Paul Rubin <"http://phr.cx"@NOSPAM.invalid> wrote: > [...] > Here's yet another example that came up in something > I was working on: > you are indexing a book and you want to print a list > of page numbers > for pages that refer to George Washington. If > Washington occurs on > several conse
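A sketch of one way the page-index example might be handled, assuming the goal is to collapse consecutive page numbers into ranges (the quoted text is cut off, so this reading is a guess). The function names and page list are hypothetical. The trick is that `page - position` stays constant within a run of consecutive pages, so it can serve as the `groupby` key.

```python
from itertools import groupby

def page_ranges(pages):
    """Collapse a sorted list of page numbers into (first, last) runs."""
    ranges = []
    # Within a run of consecutive pages, page minus index is constant.
    for _, run in groupby(enumerate(pages), key=lambda pair: pair[1] - pair[0]):
        run_pages = [page for _, page in run]
        ranges.append((run_pages[0], run_pages[-1]))
    return ranges

def format_ranges(pages):
    """Render the runs the way a book index would print them."""
    return ", ".join(
        str(first) if first == last else f"{first}-{last}"
        for first, last in page_ranges(pages)
    )

index_entry = format_ranges([3, 4, 5, 9, 12, 13])
```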

Re: itertools.groupby

2007-05-28 Thread Steve Howell
--- Raymond Hettinger <[EMAIL PROTECTED]> wrote: > + The operation of \function{groupby()} is similar > to the \code{uniq} > filter > + in \UNIX{}. [...] Thanks! The comparison of groupby() to "uniq" really clicks with me. To the extent that others like the Unix command line analogy for u

Re: itertools.groupby

2007-05-28 Thread Carsten Haese
On Sun, 2007-05-27 at 20:28 -0700, Paul Rubin wrote:
>    fst = operator.itemgetter(0)
>    snd = operator.itemgetter(1)
>
>    def bates(fd):
>      # generate tuples (n,d) of lines from file fd,
>      # where n is the record number.  Just iterate through all lines
>      # of the file, stamping

Re: itertools.groupby

2007-05-28 Thread Carsten Haese
On Sun, 2007-05-27 at 18:12 -0700, Steve Howell wrote: > [...] there is no way > that "uniquekeys" is a sensible variable [...] That's because the OP didn't heed the advice from the docs that "Generally, the iterable needs to already be sorted on the same key function." > http://informixdb.blogsp

Re: itertools.groupby

2007-05-27 Thread Paul Rubin
Raymond Hettinger <[EMAIL PROTECTED]> writes: > On May 27, 8:28 pm, Paul Rubin wrote: > > I use the module all the time now and it is great. > Thanks for the accolades and the great example. Thank YOU for the great module ;). Feel free to use the example in the docs i

Re: itertools.groupby

2007-05-27 Thread Raymond Hettinger
On May 27, 8:28 pm, Paul Rubin wrote: > I use the module all the time now and it is great. Thanks for the accolades and the great example. FWIW, I checked in a minor update to the docs: +++ python/trunk/Doc/lib/libitertools.tex Mon May 28 07:23:22 2007 @@ -138,6 +13

Re: itertools.groupby

2007-05-27 Thread Paul Rubin
Raymond Hettinger <[EMAIL PROTECTED]> writes: > The groupby itertool came-out in Py2.4 and has had remarkable > success (people seem to get what it does and like using it, and > there have been no bug reports or reports of usability problems). > All in all, that ain't bad (for what 7stud calls a po

Re: itertools.groupby

2007-05-27 Thread Steve Howell
--- Raymond Hettinger <[EMAIL PROTECTED]> wrote: > > FWIW, I wrote those docs. Suggested improvements > are > welcome; however, I think they already meet a > somewhat > high standard of quality: > I respectfully disagree, and I have suggested improvements in this thread. Without even reading

Re: itertools.groupby

2007-05-27 Thread Steve Howell
--- Carsten Haese <[EMAIL PROTECTED]> wrote: > [...] It's an abstract code pattern for an abstract use > case. I question the use of abstract code patterns in documentation, as they just lead to confusion. I really think concrete examples are better in any circumstance. Also, to the OP's ori

Re: itertools.groupby

2007-05-27 Thread Raymond Hettinger
On May 27, 2:59 pm, Steve Howell <[EMAIL PROTECTED]> wrote: > These docs need work. Please do not defend them; > please suggest improvements. FWIW, I wrote those docs. Suggested improvements are welcome; however, I think they already meet a somewhat high standard of quality: - there is an accur

Re: itertools.groupby

2007-05-27 Thread Carsten Haese
On Sun, 2007-05-27 at 14:59 -0700, Steve Howell wrote: > Huh? How is code that uses itertools.groupby not an > actual example of using itertools.groupby? Here's how: """ The returned group is itself an iterator that shares the underlying iterable with groupby(). Because the source is shared, whe
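A small sketch of the shared-iterator pitfall described in the quoted documentation. The sample data is illustrative. Each group has to be consumed (for example with `list()`) before `groupby` is advanced to the next key; group iterators that are stored and read later no longer see their items.

```python
from itertools import groupby

data = [1, 1, 2, 2, 3]

# Safe: materialize each group before asking groupby() for the next one.
good = [(key, list(group)) for key, group in groupby(data)]

# Unsafe: keep the group iterators around and read them afterwards.
saved = [(key, group) for key, group in groupby(data)]
stale = [(key, list(group)) for key, group in saved]

# The first saved group is already empty, because groupby() moved past it.
first_stale_group = stale[0][1]
```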

Re: itertools.groupby

2007-05-27 Thread Steve Howell
--- Carsten Haese <[EMAIL PROTECTED]> wrote: > On Sun, 2007-05-27 at 10:17 -0700, 7stud wrote: > > Bejeezus. The description of groupby in the docs > is a poster child > > for why the docs need user comments. Can someone > explain to me in > > what sense the name 'uniquekeys' is used in this > exa

Re: itertools.groupby

2007-05-27 Thread Steve Howell
--- paul <[EMAIL PROTECTED]> wrote: > > > > Regarding the pitfalls of groupby in general (even > > assuming we had better documentation), I invite > people > > to view the following posting that I made on > > python-ideas, entitled "SQL-like way to manipulate > > Python data structures": > > >

Re: itertools.groupby

2007-05-27 Thread paul
Steve Howell schrieb: > --- Steve Howell <[EMAIL PROTECTED]> wrote: > >> --- 7stud <[EMAIL PROTECTED]> wrote: >> >>> Bejeezus. The description of groupby in the docs >> is >>> a poster child >>> for why the docs need user comments. > > Regarding the pitfalls of groupby in general (even > assum

Re: itertools.groupby

2007-05-27 Thread Carsten Haese
On Sun, 2007-05-27 at 10:17 -0700, 7stud wrote: > Bejeezus. The description of groupby in the docs is a poster child > for why the docs need user comments. Can someone explain to me in > what sense the name 'uniquekeys' is used in this example: > > > import itertools > > mylist = ['a', 1, 'b', 2,

Re: itertools.groupby

2007-05-27 Thread Steve Howell
--- Steve Howell <[EMAIL PROTECTED]> wrote: > > --- 7stud <[EMAIL PROTECTED]> wrote: > > > Bejeezus. The description of groupby in the docs > is > > a poster child > > for why the docs need user comments. > Regarding the pitfalls of groupby in general (even assuming we had better documenta

Re: itertools.groupby

2007-05-27 Thread Steve Howell
--- 7stud <[EMAIL PROTECTED]> wrote:
> I'd settle for a simple explanation of what it does in python.
The groupby function prevents you from having to write awkward (and possibly broken) code like this:
    group = []
    lastKey = None
    for item in items:
        newKey = item.k
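A sketch contrasting a hand-written grouping loop of the kind the post alludes to with the `groupby` equivalent. The quoted code is cut off, so `manual_groups` is a hypothetical reconstruction of the general pattern, not the poster's code; the `(key, value)` sample items are invented.

```python
from itertools import groupby
from operator import itemgetter

items = [("fruit", "apple"), ("fruit", "pear"), ("veg", "kale"), ("fruit", "fig")]

def manual_groups(items):
    """Group adjacent items by key with explicit bookkeeping."""
    groups = []
    group = []
    last_key = None
    for key, value in items:
        if group and key != last_key:
            groups.append((last_key, group))
            group = []
        group.append(value)
        last_key = key
    if group:
        groups.append((last_key, group))  # flushing the final group is easy to forget
    return groups

def groupby_groups(items):
    """Same result, with the bookkeeping delegated to groupby()."""
    return [
        (key, [value for _, value in group])
        for key, group in groupby(items, key=itemgetter(0))
    ]

manual = manual_groups(items)
automatic = groupby_groups(items)
```

Both versions treat the second run of "fruit" as a new group, since only adjacent items are compared.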

Re: itertools.groupby

2007-05-27 Thread Steve Howell
--- 7stud <[EMAIL PROTECTED]> wrote: > Bejeezus. The description of groupby in the docs is > a poster child > for why the docs need user comments. I would suggest an example with a little more concreteness than what's currently there. For example, this code... import itertools syslog_messa

Re: itertools.groupby

2007-05-27 Thread 7stud
On May 27, 11:28 am, Steve Howell <[EMAIL PROTECTED]> wrote: > --- 7stud <[EMAIL PROTECTED]> wrote: > > Bejeezus. The description of groupby in the docs is > > a poster child > > for why the docs need user comments. Can someone > > explain to me in > > what sense the name 'uniquekeys' is used thi

Re: itertools.groupby

2007-05-27 Thread Steve Howell
--- 7stud <[EMAIL PROTECTED]> wrote: > Bejeezus. The description of groupby in the docs is > a poster child > for why the docs need user comments. Can someone > explain to me in > what sense the name 'uniquekeys' is used in this > example: [...] > The groupby method has its uses, but its behavi