Hi,

>Hello.
>
>On Tue, 24 Mar 2020 at 06:39, chentao...@qq.com <chentao...@qq.com> wrote:
>>
>> Hi,
>>
>> I have started 2 PRs to solve the problem you mentioned.
>>
>> About the "CentroidInitializer" I have a new idea:
>> Move the CentroidInitializers in as inner classes of "KMeansPlusPlusClusterer",
>> and add a constructor parameter and a property "useKMeansPlusPlus" to
>> "KMeansPlusPlusClusterer":
>> ```java
>> // Add "useKMeansPlusPlus" to "KMeansPlusPlusClusterer"
>> public class KMeansPlusPlusClusterer<T extends Clusterable> extends
>> Clusterer<T> {
>>     public KMeansPlusPlusClusterer(final int k, final int maxIterations,
>>                                    final DistanceMeasure measure,
>>                                    final UniformRandomProvider random,
>>                                    final EmptyClusterStrategy emptyStrategy,
>> +                                  final boolean useKMeansPlusPlus) {
>>         // ...
>> -       // Use K-means++ to choose the initial centers.
>> -       this.centroidInitializer = new KMeansPlusPlusCentroidInitializer(measure,
>> random);
>> +       this.useKMeansPlusPlus = useKMeansPlusPlus;
>>     }
>
>What if one comes up with a third way to initialize the centroids?
>If you can ensure that there is no other initialization procedure,
>a boolean is fine, if not, we could still make the existing procedures
>package-private (e.g. moving them in as classes defined within
>"KMeansPlusPlusClusterer").
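
To make the "third way" question concrete, here is a minimal sketch of the alternative to a boolean flag: a small strategy interface handed to the clusterer. Everything in it is hypothetical and for illustration only: the `chooseCenters` signature, `RandomInitializer`, `KMeansPlusPlusInitializer`, `ClustererSketch`, and the use of `double[]` points and `java.util.Random` in place of `Clusterable` and `UniformRandomProvider`. It is not the Commons Math API.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

// Hypothetical strategy interface (illustration only, not the Commons Math API).
interface CentroidInitializer {
    // Returns up to k initial centers chosen from the given points.
    List<double[]> chooseCenters(List<double[]> points, int k);
}

// Plain random seeding: shuffle a copy of the input and keep the first k points.
final class RandomInitializer implements CentroidInitializer {
    private final Random rng;

    RandomInitializer(Random rng) {
        this.rng = rng;
    }

    @Override
    public List<double[]> chooseCenters(List<double[]> points, int k) {
        List<double[]> pool = new ArrayList<>(points);
        Collections.shuffle(pool, rng);
        int n = Math.max(0, Math.min(k, pool.size()));
        return new ArrayList<>(pool.subList(0, n));
    }
}

// k-means++ seeding: each new center is drawn with probability proportional
// to its squared distance from the nearest center chosen so far.
final class KMeansPlusPlusInitializer implements CentroidInitializer {
    private final Random rng;

    KMeansPlusPlusInitializer(Random rng) {
        this.rng = rng;
    }

    // Squared Euclidean distance between two points of equal dimension.
    static double dist2(double[] a, double[] b) {
        double s = 0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            s += d * d;
        }
        return s;
    }

    @Override
    public List<double[]> chooseCenters(List<double[]> points, int k) {
        List<double[]> centers = new ArrayList<>();
        if (points.isEmpty() || k <= 0) {
            return centers;
        }
        centers.add(points.get(rng.nextInt(points.size())));
        double[] minDist2 = new double[points.size()];
        java.util.Arrays.fill(minDist2, Double.POSITIVE_INFINITY);
        while (centers.size() < Math.min(k, points.size())) {
            // Update each point's distance to its nearest center.
            double[] last = centers.get(centers.size() - 1);
            double sum = 0;
            for (int i = 0; i < points.size(); i++) {
                minDist2[i] = Math.min(minDist2[i], dist2(points.get(i), last));
                sum += minDist2[i];
            }
            if (sum == 0) {
                break; // every remaining point coincides with a chosen center
            }
            // Weighted draw; points with zero weight are never selected.
            double r = rng.nextDouble() * sum;
            double acc = 0;
            int next = -1;
            for (int i = 0; i < points.size(); i++) {
                acc += minDist2[i];
                if (minDist2[i] > 0) {
                    next = i;
                    if (acc >= r) {
                        break;
                    }
                }
            }
            centers.add(points.get(next));
        }
        return centers;
    }
}

// Stand-in for the clusterer: it takes a strategy instead of a boolean.
final class ClustererSketch {
    private final CentroidInitializer initializer;

    ClustererSketch(CentroidInitializer initializer) {
        this.initializer = initializer;
    }

    List<double[]> initialCenters(List<double[]> points, int k) {
        return initializer.chooseCenters(points, k);
    }
}
```

With this shape, a third procedure is just another class implementing the interface, whereas a boolean would have to turn into an enum or yet another constructor.
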

As far as I know, k-means has only two center initialization methods so far, random and k-means++, so a boolean that chooses between them is good enough for current use. But there are two situations in which users need to implement the center initialization themselves:
1. The Commons Math implementation is not good enough;
2. There are new center initialization methods.

>
>Also, in the current implementation of "KMeansPlusPlusClusterer", the
>initialization is not configurable ("KMeansPlusPlusCentroidInitializer").
>Perhaps we don't want to depart from the original (?) algorithm; if so,
>the new constructor could be made protected (thus simplifying the API).

k-means++ is the recommended center initialization method nowadays. Should we let users fall back to choosing the centers at random? That is a trade-off we need to weigh: should we make the API simple or rich?

>
>> public boolean isUseKMeansPlusPlus() {return this.useKMeansPlusPlus;}
>
>Why should this method be defined?

To let users read back their clusterer's parameters, the same as "getEmptyStrategy()".

>
>Regards,
>Gilles
>
>> [...]