On Sat, Jun 5, 2010 at 6:20 PM, GZ <zyzhu2...@gmail.com> wrote:
> Hi,
>
> I am looking for a fast internal vector representation so that
> (a1,b2,c1)+(a2,b2,c2)=(a1+a2,b1+b2,c1+c2).
>
> So I have a list
>
> l = ['a'a,'bb','ca','de'...]
>
> I want to count all items that start with an 'a', 'b', and 'c'.
>
> What I can do is:
>
> count_a = sum(int(x[1]=='a') for x in l)
> count_b = sum(int(x[1]=='b') for x in l)
> count_c = sum(int(x[1]=='c') for x in l)
>
> But this loops through the list three times, which can be slow.

I don't really get how that relates to vectors or why you'd use that
representation, and it looks like you've forgotten that Python uses
0-based indexing, but anyway, here's my crack at something more
efficient:

from collections import defaultdict

cared_about = set('abc')
letter2count = defaultdict(int)  # missing letters default to a count of 0
# Single pass over the list instead of three.
for item in l:
    initial = item[0]  # first character (0-based); assumes no empty strings
    if initial in cared_about:
        letter2count[initial] += 1

count_a = letter2count['a']
count_b = letter2count['b']
count_c = letter2count['c']

Cheers,
Chris
--
http://blog.rebertia.com
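
P.S. On the "vector" part of the question: one possible approach, assuming Python 2.7 / 3.1 or later, is collections.Counter, which does the single-pass counting and also supports + to add counts key by key. The sketch below is only an illustration; count_initials and the sample lists are made-up names and data, not anything from the original post.

```python
from collections import Counter

def count_initials(items, cared_about="abc"):
    """Count, in one pass, the items whose first character is in cared_about.

    Hypothetical helper for illustration only.
    """
    wanted = set(cared_about)
    # item[:1] is '' for an empty string, so empty items are skipped
    # instead of raising IndexError the way item[0] would.
    return Counter(item[:1] for item in items if item[:1] in wanted)

# Made-up sample data.
first = count_initials(['aa', 'bb', 'ca', 'de'])
second = count_initials(['ab', 'cd', 'cc'])

# Counter addition sums the counts per key, which behaves like
# (a1, b1, c1) + (a2, b2, c2) = (a1+a2, b1+b2, c1+c2).
total = first + second

count_a = total['a']
count_b = total['b']
count_c = total['c']
```

A Counter returns 0 for keys it has never seen, so looking up a letter that did not occur is safe, much like the defaultdict(int) version above.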