Hi Donni,

I believe that the canopy clustering algorithm will do what you want, though I haven't played around with it myself yet. The clustering chapter in the 'Mahout in Action' book covers this fairly well.
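
For illustration only, here is a rough sketch of the canopy idea in plain Python. It is not Mahout's API, and Mahout's own implementation may differ in details. It assumes cosine distance on dense TF-IDF vectors and two thresholds t1 > t2; the function names, the toy vectors, and the threshold values are all made up for the example. Each canopy centre is one of the input documents, so the returned indices could serve as candidate initial centroids for k-means.

```python
import math


def cosine_distance(a, b):
    """Return 1 - cosine similarity; 1.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 1.0
    return 1.0 - dot / (norm_a * norm_b)


def canopy_centers(vectors, t1, t2, distance=cosine_distance):
    """Return a list of (center_index, member_indices) pairs.

    t1 is the loose threshold (canopy membership) and t2 the tight
    threshold (points this close to a centre cannot become centres).
    """
    if t1 <= t2:
        raise ValueError("t1 must be greater than t2")
    remaining = list(range(len(vectors)))
    canopies = []
    while remaining:
        # Take the next candidate document as a new canopy centre.
        center = remaining[0]
        # Every document within t1 of the centre joins this canopy.
        members = [
            i for i in range(len(vectors))
            if distance(vectors[center], vectors[i]) < t1
        ]
        canopies.append((center, members))
        # Drop the centre and every candidate within t2 of it.
        remaining = [
            i for i in remaining
            if i != center and distance(vectors[center], vectors[i]) >= t2
        ]
    return canopies


# Toy TF-IDF vectors (hypothetical): two documents per topic.
docs = [
    [1.0, 0.0, 0.0],
    [0.9, 0.1, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.9, 0.1],
]
canopies = canopy_centers(docs, t1=0.5, t2=0.2)
seeds = [center for center, _ in canopies]
print(seeds)  # [0, 2]
```

With these toy vectors, documents 0 and 2 end up as the two centres, one per topic. The number of centres depends entirely on t1 and t2, so the thresholds need tuning for a real dataset.
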

Cheers,
Sean

----------------------
Dr Sean Farrell
Data Scientist

On Tue, Nov 18, 2014 at 12:01 AM, Donni Khan <[email protected]> wrote:
> Hi All,
>
> I'm working with text clustering. I want to select specific documents (as
> vectors) to be centroids for k-means.
> I have created the TF-IDF for my dataset by using Mahout, and I would like
> to choose the initial clusters from the TF-IDF vectors.
>
> Does anyone have an idea how I can do it with Mahout?
>
> Many thanks in advance.
> Donni
