On 14/06/2006 8:38 AM, Robert Kern wrote: > Gary Herron wrote: >> John Machin wrote: >> >>> On 13/06/2006 6:28 PM, Paul McGuire wrote: >>> >>>> (Oh, and I like groupby too! Combine it with sort to quickly create >>>> histograms.) >>>> >>>> # tally a histogram of a list of values from 1-10 >>>> dataValueRange = range(1,11) >>>> data = [random.choice(dataValueRange) for i in xrange(10000)] >>>> >>>> hist = [ (k,len(list(g))) for k,g in itertools.groupby(sorted(data)) ] >>> That len(list(g)) looks like it uses O(N) memory just to find out what N >>> is :-( >> Not at all! A python list *knows* its length at all times. len() is a >> constant time lookup of an internal attribute. > > The point is that you had to create the list in the first place. g is an > iterator. >
I didn't have to create a list in the first place. Paul did. The point of my post was to avoid the memory grab caused by list(g) by seeking a way that just counted g's output. Sorry for the confusion my lack of clarity has evidently caused. I'll rephrase: That whole construct len(list(g)) looks like it uses O(N) memory just to find out what N is. Better? -- http://mail.python.org/mailman/listinfo/python-list