I'm working on a naive K-nearest-neighbors selection criterion for an optical character recognition problem.
After I build my training set, I test each new image against the trained feature vectors and record the scores as follows:

    match_vals = [(match_val_1, identifier_a), (match_val_2, identifier_b), ...]

and so on. Then I sort the list so the smallest match_vals appear first (indicating a strong match), so I may end up with something like this:

    [(match_val_291, identifier_b), (match_val_23, identifier_b), (match_val_22, identifier_k), ...]

Now, what I would like to do is step through this list and find the identifier that is the first to appear K times. Naively, I could make a dict, iterate through the list AND the dict at the same time, and keep a tally, breaking when the criterion is met, such as:

    from collections import defaultdict

    def getnn(match_vals):
        tallies = defaultdict(lambda: 0)
        for match_val, ident in match_vals:
            tallies[ident] += 1
            # Rescan every tally after each update (K is hard-coded as 5).
            for ident, tally in tallies.items():
                if tally == 5:
                    return ident

I would think there is a better way to do this. Any ideas?

Cheers!
Chris
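One possible simplification, offered only as a sketch: after each increment, the only identifier whose tally can have just reached K is the one that was incremented, so the inner loop over the dict in getnn above may not be needed. The function name first_to_k and the k parameter are hypothetical (the post hard-codes 5), and the sketch assumes match_vals is already sorted with the best matches first.

```python
from collections import Counter

def first_to_k(match_vals, k=5):
    """Return the first identifier to reach k occurrences, or None.

    Assumes match_vals is a list of (match_val, ident) tuples that is
    already sorted so the strongest matches come first.
    """
    tallies = Counter()
    for _match_val, ident in match_vals:
        tallies[ident] += 1
        # Only the identifier just incremented can have newly reached k,
        # so checking it alone is enough.
        if tallies[ident] == k:
            return ident
    # No identifier appeared k times in the whole list.
    return None

# Made-up scores, purely for illustration.
sample = [(0.1, "b"), (0.2, "b"), (0.3, "k"), (0.4, "k"), (0.5, "k"), (0.6, "b")]
winner = first_to_k(sample, k=3)
```

With the made-up sample, "k" reaches three occurrences at the fifth entry, before "b" gets its third at the sixth, so "k" is what comes back. Returning None when nothing reaches k is a design choice of this sketch; the original getnn also falls through to an implicit None in that case.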