Gregory Piñero wrote:

[top-posting rearranged]

> On 12/14/05, Peter Otten <[EMAIL PROTECTED]> wrote:
>
>>Gregory Piñero wrote:
>>
>>
>>>def add_freqs(freq1,freq2):
>>>    """Add two word freq dicts"""
>>>    newfreq={}
>>>    for key,value in freq1.items():
>>>        newfreq[key]=value+freq2.get(key,0)
>>>    for key,value in freq2.items():
>>>        newfreq[key]=value+freq1.get(key,0)
>>>    return newfreq
>>
>>>Any ideas on doing this task a lot faster would be appreciated.
>>
>>With items() you copy the whole dictionary into a list of tuples;
>>iteritems() just walks over the existing dictionary and creates one tuple
>>at a time.
>>
>>With "80% overlap", you are looking up and setting four out of five values
>>twice in your for-loops.
>>
>>Dump the symmetry and try one of these:
>>
>>def add_freqs2(freq1, freq2):
>>    total = dict(freq1)
>>    for key, value in freq2.iteritems():
>>        if key in freq1:
>>            total[key] += value
>>        else:
>>            total[key] = value
>>    return total
>>
>>def add_freqs3(freq1, freq2):
>>    total = dict(freq1)
>>    for key, value in freq2.iteritems():
>>        try:
>>            total[key] += value
>>        except KeyError:
>>            total[key] = value
>>    return total
>>
>>My guess is that add_freqs3() will perform best.
>>
> Thanks Peter, those are some really good ideas.  I can't wait to try
> them out tonight.
>
> Here's a question about your functions.  If I only look at the keys in
> freq2 then won't I miss any keys that are in freq1 and not in freq2?
> That's why I have the two loops in my original function.
>
No, because the statement
    total = dict(freq1)

creates total as a shallow copy of freq1. Thus all that remains to be done
is to add the items from freq2.

regards
 Steve
--
Steve Holden       +44 150 684 7255  +1 800 494 3119
Holden Web LLC                     www.holdenweb.com
PyCon TX 2006                  www.python.org/pycon/
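
P.S. A quick way to convince yourself of the shallow-copy point is a minimal sketch like the one below. The function name merge_freqs and the sample dictionaries are made up for illustration and are not from the thread, and the loop iterates over the keys directly with get() instead of using iteritems(), so treat it as one possible variant rather than Peter's exact code.

```python
def merge_freqs(freq1, freq2):
    """Hypothetical helper: add two word-frequency dicts."""
    # Start from a shallow copy, so every key of freq1 is already in total.
    total = dict(freq1)
    # Only the items of freq2 remain to be added.
    for key in freq2:
        total[key] = total.get(key, 0) + freq2[key]
    return total


# Made-up sample data: "a" occurs only in freq1, "c" only in freq2.
freq1 = {"a": 1, "b": 2}
freq2 = {"b": 3, "c": 4}
total = merge_freqs(freq1, freq2)
```

Here total still contains "a" even though the loop never looks at freq1, and freq1 itself is left unchanged because the additions go into the copy.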