On 11/17/2010 6:10 PM, Steve Holden wrote:
$ cat data.py
lines = open("data.txt").readlines()
Since you iterate through the file just once, I can think of no reason to make a complete in-memory copy. That would be a problem with a multi-gigabyte log file ;=). In 3.x at least, open files are line iterators, so one would just need
lines = open("data.txt")
from collections import defaultdict
c = defaultdict(int)
for line in lines:
    ls = line.split()
    if len(ls) > 3 and ls[3].startswith("NCPU="):
        amt = int(ls[3][5:])
        c[ls[0]] += amt
for key, value in c.items():
    print key, ":", value

$ python data.py
xyz : 4
tanhoi : 1
sabril : 6

regards
 Steve
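Putting the suggestion together, here is a minimal sketch of the streaming version in 3.x syntax. It keeps the parsing logic of the quoted script unchanged and only swaps readlines() for iteration over the open file. The function names total_ncpu and report are made up for illustration, and the with block is my addition so that the file is closed promptly; none of this is claimed to be the one right way to write it.

```python
from collections import defaultdict


def total_ncpu(lines):
    """Sum the NCPU= amounts per first field, reading one line at a time.

    `lines` may be any iterable of strings, including an open file,
    so the whole log never has to sit in memory at once.
    """
    totals = defaultdict(int)
    for line in lines:
        fields = line.split()
        # Same test as the quoted script: fourth field must be NCPU=<int>.
        if len(fields) > 3 and fields[3].startswith("NCPU="):
            totals[fields[0]] += int(fields[3][5:])
    return totals


def report(path):
    """Print one 'key : total' line per key found in the file at `path`."""
    # The open file is itself the line iterator; no readlines() needed.
    with open(path) as f:
        totals = total_ncpu(f)
    for key, value in totals.items():
        print(key, ":", value)
```

Calling report("data.txt") would then play the role of the quoted data.py. Since total_ncpu accepts any iterable of lines, it can also be tried on a plain list of strings without touching the disk.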
--
Terry Jan Reedy