On Jan 18, 11:15 am, David Sanders <[EMAIL PROTECTED]> wrote:
> Hi,
>
> I am processing large files of numerical data. Each line is either a
> single (positive) integer, or a pair of positive integers, where the
> second represents the number of times that the first number is
> repeated in the data -- this is to avoid generating huge raw files,
> since one particular number is often repeated in the data generation
> step.
>
> My question is how to process such files efficiently to obtain a
> frequency histogram of the data (how many times each number occurs in
> the data, taking into account the repetitions). My current code is as
> follows:
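
For reference, here is a minimal sketch of the task described above. It is not the code from the original post, which is not reproduced here. It assumes the fields on each line are separated by whitespace; the function name histogram and the sample data are made up for illustration.

```python
from collections import defaultdict

def histogram(lines):
    """Count how often each number occurs, honouring repeat counts.

    Each line is assumed to hold either "value" or "value repeat",
    separated by whitespace (an assumption about the file format).
    """
    counts = defaultdict(int)
    for line in lines:
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        value = int(fields[0])
        # A lone integer counts once; a pair carries its own repeat count.
        repeat = int(fields[1]) if len(fields) > 1 else 1
        counts[value] += repeat
    return dict(counts)

# Illustrative data only: 3 appears once, then twice more via "3 2".
sample = ["3", "5 4", "3 2", "7"]
print(histogram(sample))
```

Passing an open file object instead of a list, e.g. histogram(open("data.txt")) with a hypothetical file name, would process the file line by line rather than reading it all into memory.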

Many thanks to all for the very detailed and helpful replies. I'm glad
to see I was on the right track, but even happier to have learnt some
different approaches.

Thanks and best wishes,
David.
--
http://mail.python.org/mailman/listinfo/python-list