I have the following list of lists, which contains 6 entries:

lol = [['a', 3, 1.01], ['x', 5, 1.00], ['k', 7, 2.02], ['p', 8, 3.00], ['b', 10, 1.09], ['f', 12, 2.03]]

Each list in lol contains 3 elements:

    ['a', 3, 1.01]
      e1  e2  e3

The list above is already sorted by e2 (i.e., the 2nd element).

I'd like to 'cluster' the list above, roughly following these steps:

1. Pick the lowest entry (wrt. e2) in lol as the key of the first cluster.
2. Assign it as the first member of that cluster (a dictionary of lists).
3. Calculate the difference between e3 of the next list and e3 of the
   first member of each existing cluster.
4. If the difference is within the threshold, assign that list as a member
   of the corresponding cluster. Otherwise, create a new cluster with the
   current list as its key.
5. Repeat for the remaining lists until finished.

The final result will look like this, with threshold <= 0.1:

dol = {'a': ['a', 'x', 'b'], 'k': ['k', 'f'], 'p': ['p']}

I'm stuck at this step. What's the right way to do it?

__BEGIN__
import json
from collections import defaultdict

thres = 0.1
tmp_e3 = 0
tmp_e1 = "-"

lol = [['a', 3, 1.01], ['x', 5, 1.00], ['k', 7, 2.02],
       ['p', 8, 3.00], ['b', 10, 1.09], ['f', 12, 2.03]]

dol = defaultdict(list)
for thelist in lol:
    e1, e2, e3 = thelist

    if tmp_e1 == "-":
        tmp_e1 = e1
    else:
        diff = abs(tmp_e3 - e3)
        if diff > thres:
            tmp_e1 = e1

    dol[tmp_e1].append(e1)
    tmp_e1 = e1
    tmp_e3 = e3

print(json.dumps(dol, indent=4))
__END__

Best,
G.v.
--
https://mail.python.org/mailman/listinfo/python-list
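
One possible way to follow the steps above is sketched below. It is only a sketch, not a definitive answer: the function name cluster_by_first_member and the helper dictionary key_e3 are made up for illustration. It also assumes that an entry joins the first (oldest) cluster whose first member is within the threshold, since the steps do not say what should happen when several clusters qualify. The main difference from the attempt above is that each e3 is compared with the e3 of every cluster's first member, not with the e3 of the previous entry.

```python
from collections import OrderedDict


def cluster_by_first_member(entries, threshold=0.1):
    """Group entries by comparing e3 with the e3 of each cluster's first member."""
    clusters = OrderedDict()  # cluster key (e1 of first member) -> member e1 values
    key_e3 = OrderedDict()    # cluster key -> e3 of that cluster's first member

    for e1, _e2, e3 in entries:
        # Assumption: take the first existing cluster that is close enough.
        for key, ref in key_e3.items():
            if abs(e3 - ref) <= threshold:
                clusters[key].append(e1)
                break
        else:
            # No existing cluster matched, so this entry starts a new one.
            clusters[e1] = [e1]
            key_e3[e1] = e3
    return clusters


lol = [['a', 3, 1.01], ['x', 5, 1.00], ['k', 7, 2.02],
       ['p', 8, 3.00], ['b', 10, 1.09], ['f', 12, 2.03]]

dol = cluster_by_first_member(lol, threshold=0.1)
print(dict(dol))
```

With the sample data this should reproduce the dol shown in the question. Because lol is already sorted by e2, the first entry that opens a cluster is automatically the one with the lowest e2 in that cluster.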