I have a self-organizing net whose aim is to cluster words. Let's assume the clustering is based on their sets of 2-grams. Words are then instances of this class:

class clusterable(str):
    def __abs__(self):
        # the set of q-grams (to be calculated only once)
        return set([(self+self[0])[n:n+2] for n in range(len(self))])
    def __sub__(self, other):
        # the q-grams distance between 2 words
        set1 = abs(self)
        set2 = abs(other)
        return len(set1|set2) - len(set1&set2)

I'm looking for the medium of a set of words, i.e. the word which minimizes the sum of the distances from those words. In other words, it minimizes:

    sum([medium - word for word in words])

Thanks for any ideas,
Paolino
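P.S. To make the goal concrete, here is a minimal sketch of the quantity being minimized. It restates the class so that it runs on its own, and it assumes the candidates are restricted to the given words themselves. That restriction is only an assumption for illustration: the true minimizer need not be one of the input words, which is exactly the open part of the question. The helper name `medoid` is hypothetical.

```python
class clusterable(str):
    def __abs__(self):
        # set of circular 2-grams: the last character is paired with the first
        return set([(self + self[0])[n:n + 2] for n in range(len(self))])

    def __sub__(self, other):
        # q-gram distance: number of 2-grams found in exactly one of the two words
        set1 = abs(self)
        set2 = abs(other)
        return len(set1 | set2) - len(set1 & set2)


def medoid(words):
    # hypothetical helper: brute-force search over the input words only (assumption),
    # returning the one with the smallest summed distance to all words
    return min(words, key=lambda cand: sum([cand - word for word in words]))


words = [clusterable(w) for w in ["spam", "span", "scan", "spat"]]
print(medoid(words))  # prints "span" (summed distance 12)
```

This costs one distance computation per pair of words, so it only gives a baseline to compare better ideas against.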