Matt:

> from collections import defaultdict
>
> def get_hist(file_name):
>     hist = defaultdict(int)
>     f = open(file_name,"r")
>     for line in f:
>         vals = line.split()
>         val = int(vals[0])
>         try: # don't look to see if you will cause an error,
>              # just cause it and then deal with it
>             cnt = int(vals[1])
>         except IndexError:
>             cnt = 1
>         hist[val] += cnt
>     return hist


But in tight loops, raising and catching exceptions usually slows
Python code down, so this version is noticeably faster (about 2.5 times
faster with Psyco, about 2 times faster without, on input where about
30% of the lines contain a space):

import psyco
from collections import defaultdict

def get_hist(file_name):
    hist = defaultdict(int)

    for line in open(file_name):
        if " " in line:
            pair = line.split()
            hist[int(pair[0])] += int(pair[1])
        else:
            hist[int(line)] += 1

    return hist

psyco.bind(get_hist)

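It doesn't work if lines may contain spurious spaces: a lone value
followed by a trailing space passes the " " test, and then pair[1]
raises an IndexError.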
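
If spurious spaces are possible, a variant along these lines should cope
with them. It is only a sketch: it splits every line and tests the
number of fields instead of looking for a space. The helper name
hist_from_lines and the choice to skip blank lines are my assumptions,
not part of the code above, and I have not timed it, so the speedup
figures quoted earlier do not apply to it.

```python
from collections import defaultdict

def hist_from_lines(lines):
    # Hypothetical helper: builds the histogram from any iterable of lines.
    hist = defaultdict(int)
    for line in lines:
        pair = line.split()  # split() ignores leading/trailing/repeated whitespace
        if not pair:
            continue  # assumption: blank lines are skipped, not treated as errors
        if len(pair) > 1:
            hist[int(pair[0])] += int(pair[1])  # "value count" line
        else:
            hist[int(pair[0])] += 1  # lone value counts once
    return hist

def get_hist(file_name):
    # Same interface as the versions above; the file is closed on exit.
    with open(file_name) as f:
        return hist_from_lines(f)
```

For example, hist_from_lines(["1 2\n", " 1 \n", "3\n"]) adds 2 and then 1
to the count for 1, and 1 to the count for 3.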

Bye,
bearophile
