Hi all This is probably old hat to most of you, but for me it was a revelation, so I thought I would share it in case someone has a similar requirement.
I had to convert an old program that does a traditional pass through a sorted data file, breaking on a change of certain fields, processing each row, accumulating various totals, and doing additional processing at each break. I am not using a database for this one, as the file sizes are not large - a few thousand rows at most. I am using csv files, and using the csv module so that each row is nicely formatted into a list. The traditional approach is quite fiddly, saving the values of the various break fields, comparing the values on each row with the saved values, and taking action if the values differ. The more break fields there are, the fiddlier it gets. I was going to do the same in python, but then I vaguely remembered reading about 'groupby'. It took a little while to figure it out, but once I had cracked it, it transformed the task into one of utter simplicity. Here is an example. Imagine a transaction file sorted by branch, account number, and date, and you want to break on all three. ----------------------------- import csv from itertools import groupby from operator import itemgetter BRN = 0 ACC = 1 DATE = 2 reader = csv.reader(open('trans.csv', 'rb')) rows = [] for row in reader: rows.append(row) for brn,brnList in groupby(rows,itemgetter(BRN)): for acc,accList in groupby(brnList,itemgetter(ACC)): for date,dateList in groupby(accList,itemgetter(DATE)): for row in dateList: [do something with row] [do something on change of date] [do something on change of acc] [do something on change of brn] ----------------------------- Hope someone finds this of interest. Frank Millman -- http://mail.python.org/mailman/listinfo/python-list