I set the number of clusters to 120:
https://drive.google.com/file/d/0Bxs_ao6uuBDUZFByNVgzd0Jrdm8/view?usp=sharing

The result seems better, but it is still a long way from perfect.

On Friday, August 4, 2017 at 10:09:59 PM UTC+8, Ho Yeung Lee wrote:
> Actually, I am using Python's kmeans library, so it is relevant to Python.
>
> I have changed my code to use kmeans:
>
> https://gist.github.com/hoyeunglee/2475391ad554e3d2b2a40ec24ab47940
>
> I do not know whether I wrote it correctly,
> but it seems able to cluster and find words in the window, though not perfectly.
>
> On Friday, August 4, 2017 at 8:24:54 PM UTC+8, Alain Ketterlin wrote:
> > Ho Yeung Lee <jobmatt...@gmail.com> writes:
> >
> > > I find that kmeans needs the number of clusters as input
> > [...]
> >
> > https://en.wikipedia.org/wiki/Determining_the_number_of_clusters_in_a_data_set
> >
> > Completely off-topic on this group/list, please direct your questions
> > elsewhere.
> >
> > -- Alain.
--
https://mail.python.org/mailman/listinfo/python-list
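
Instead of guessing a value like 120, one option from the Wikipedia page linked above is the elbow heuristic: run kmeans for several values of k and look for the point where the within-cluster sum of squares stops dropping sharply. Below is a rough sketch of that idea, not the code from my gist. It uses only the standard library, the `kmeans` and `make_blobs` helpers are hypothetical names written for this illustration, and the data is synthetic (three made-up 2-D blobs), so real window/image data may behave quite differently.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, iters=50, seed=0):
    """Plain Lloyd's k-means (illustrative helper, not a library API).

    Returns (centroids, inertia), where inertia is the sum of squared
    distances from each point to its nearest centroid.
    """
    rng = random.Random(seed)
    # Farthest-point initialisation: the first centre is a random point,
    # each further centre is the point farthest from the centres so far.
    centroids = [rng.choice(points)]
    while len(centroids) < k:
        centroids.append(
            max(points, key=lambda p: min(sq_dist(p, c) for c in centroids))
        )
    for _ in range(iters):
        # Assignment step: put each point in the cluster of its nearest centre.
        clusters = [[] for _ in range(k)]
        for p in points:
            idx = min(range(k), key=lambda i: sq_dist(p, centroids[i]))
            clusters[idx].append(p)
        # Update step: move each centre to the mean of its members
        # (an empty cluster keeps its previous centre).
        new_centroids = [
            tuple(sum(dim) / len(members) for dim in zip(*members))
            if members else centroids[i]
            for i, members in enumerate(clusters)
        ]
        if new_centroids == centroids:
            break
        centroids = new_centroids
    inertia = sum(min(sq_dist(p, c) for c in centroids) for p in points)
    return centroids, inertia


def make_blobs(centers, n_per=30, spread=0.5, seed=1):
    """Synthetic test data: Gaussian blobs around the given centres."""
    rng = random.Random(seed)
    return [
        (rng.gauss(cx, spread), rng.gauss(cy, spread))
        for cx, cy in centers
        for _ in range(n_per)
    ]


# Three well-separated blobs, so the "right" answer here is k = 3.
points = make_blobs([(0, 0), (10, 10), (20, 0)])
inertias = {k: kmeans(points, k)[1] for k in range(1, 7)}
for k, value in inertias.items():
    print(k, round(value, 1))
```

On this toy data the printed inertia should fall steeply up to k = 3 and then flatten out, and that bend is the "elbow". With real data the bend is often much less clear, so this is only a starting point for choosing k.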