Hi, I have started 2 PRs to solve the problem you mentioned.
About the "CentroidInitializer", I have a new idea: move the CentroidInitializer implementations into "KMeansPlusPlusClusterer" as inner classes, and add a constructor parameter and a property "useKMeansPlusPlus" to "KMeansPlusPlusClusterer":

```java
// Add "useKMeansPlusPlus" to "KMeansPlusPlusClusterer"
public class KMeansPlusPlusClusterer<T extends Clusterable> extends Clusterer<T> {
    public KMeansPlusPlusClusterer(final int k,
                                   final int maxIterations,
                                   final DistanceMeasure measure,
                                   final UniformRandomProvider random,
                                   final EmptyClusterStrategy emptyStrategy,
+                                  final boolean useKMeansPlusPlus) {
        // ...
-       // Use K-means++ to choose the initial centers.
-       this.centroidInitializer = new KMeansPlusPlusCentroidInitializer(measure, random);
+       this.useKMeansPlusPlus = useKMeansPlusPlus;
    }

    public boolean isUseKMeansPlusPlus() {
        return this.useKMeansPlusPlus;
    }

    // Make "chooseInitialCenters" package-private and call "CentroidInitializer.selectCentroids".
    // Then "chooseInitialCenters" can be reused by "MiniBatchKMeans".
    List<CentroidCluster<T>> chooseInitialCenters(final Collection<T> points) {
        // Choose the initial centers with K-means++ or at random, depending on the flag.
        final CentroidInitializer centroidInitializer = useKMeansPlusPlus
            ? new KMeansPlusPlusCentroidInitializer(this.measure, this.random)
            : new RandomCentroidInitializer(this.random);
        return centroidInitializer.selectCentroids(points, this.k);
    }

    // Make CentroidInitializer private
    private static interface CentroidInitializer {
        <T extends Clusterable> List<CentroidCluster<T>> selectCentroids(final Collection<T> points, final int k);
    }

    private static class RandomCentroidInitializer implements CentroidInitializer {...}

    private static class KMeansPlusPlusCentroidInitializer implements CentroidInitializer {...}
}
```

The "CentroidInitializer" is only used in "KMeansPlusPlusClusterer" and "MiniBatchKMeans"; the other k-means based algorithms take a "KMeansPlusPlusClusterer" as a parameter.
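To make the idea concrete, here is a minimal standalone sketch of the flag-based selection. It is only an illustration under assumptions: points are plain `double[]` and the RNG is `java.util.Random` rather than the library's `Clusterable` and `UniformRandomProvider`, the distance is squared Euclidean, and `InitializerSketch` and every name inside it are hypothetical, not the actual Commons Math classes.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Random;

// Illustrative sketch only: these types are NOT the Commons Math classes.
final class InitializerSketch {

    /** Hypothetical stand-in for the proposed private CentroidInitializer. */
    interface CentroidInitializer {
        List<double[]> selectCentroids(List<double[]> points, int k);
    }

    /** Picks up to k distinct points uniformly at random. */
    static final class RandomCentroidInitializer implements CentroidInitializer {
        private final Random random;

        RandomCentroidInitializer(Random random) {
            this.random = random;
        }

        @Override
        public List<double[]> selectCentroids(List<double[]> points, int k) {
            List<double[]> pool = new ArrayList<>(points);
            List<double[]> centers = new ArrayList<>();
            while (centers.size() < k && !pool.isEmpty()) {
                centers.add(pool.remove(random.nextInt(pool.size())));
            }
            return centers;
        }
    }

    /** K-means++ seeding: the next center is drawn with probability
     *  proportional to the squared distance to the nearest chosen center. */
    static final class KMeansPlusPlusCentroidInitializer implements CentroidInitializer {
        private final Random random;

        KMeansPlusPlusCentroidInitializer(Random random) {
            this.random = random;
        }

        @Override
        public List<double[]> selectCentroids(List<double[]> points, int k) {
            List<double[]> centers = new ArrayList<>();
            if (points.isEmpty() || k <= 0) {
                return centers;
            }
            centers.add(points.get(random.nextInt(points.size())));
            double[] minDistSq = new double[points.size()];
            Arrays.fill(minDistSq, Double.POSITIVE_INFINITY);
            while (centers.size() < k) {
                // Update each point's distance to its nearest center so far.
                double[] last = centers.get(centers.size() - 1);
                double sum = 0;
                for (int i = 0; i < minDistSq.length; i++) {
                    minDistSq[i] = Math.min(minDistSq[i], distSq(points.get(i), last));
                    sum += minDistSq[i];
                }
                if (sum == 0) {
                    break; // every point coincides with a chosen center
                }
                // Weighted draw; points at distance 0 are never selected.
                double r = random.nextDouble() * sum;
                double acc = 0;
                int chosen = -1;
                for (int i = 0; i < minDistSq.length; i++) {
                    if (minDistSq[i] == 0) {
                        continue;
                    }
                    acc += minDistSq[i];
                    chosen = i;
                    if (acc >= r) {
                        break;
                    }
                }
                centers.add(points.get(chosen));
            }
            return centers;
        }

        private static double distSq(double[] a, double[] b) {
            double s = 0;
            for (int i = 0; i < a.length; i++) {
                s += (a[i] - b[i]) * (a[i] - b[i]);
            }
            return s;
        }
    }

    /** Mirrors the proposed chooseInitialCenters: the flag picks the strategy. */
    static List<double[]> chooseInitialCenters(List<double[]> points, int k,
                                               boolean useKMeansPlusPlus, Random random) {
        CentroidInitializer initializer = useKMeansPlusPlus
            ? new KMeansPlusPlusCentroidInitializer(random)
            : new RandomCentroidInitializer(random);
        return initializer.selectCentroids(points, k);
    }
}
```

With this shape, callers only see the boolean flag; the initializer types stay an implementation detail, which is the point of making them private inner classes.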
```java
// Changes in "MiniBatchKMeansClusterer"
public class MiniBatchKMeansClusterer<T extends Clusterable> extends KMeansPlusPlusClusterer<T> {
    public MiniBatchKMeansClusterer(final int k,
                                    final int maxIterations,
                                    final int batchSize,
                                    final int initIterations,
                                    final int initBatchSize,
                                    final int maxNoImprovementTimes,
                                    final DistanceMeasure measure,
                                    final UniformRandomProvider random,
                                    final EmptyClusterStrategy emptyStrategy,
+                                   final boolean useKMeansPlusPlus) {
-       super(k, maxIterations, measure, random, emptyStrategy);
+       super(k, maxIterations, measure, random, emptyStrategy, useKMeansPlusPlus);
        // ...
    }

    // ...

    private List<CentroidCluster<T>> initialCenters(final List<T> points) {
        // ...
-       final List<CentroidCluster<T>> clusters = getCentroidInitializer().selectCentroids(initialPoints, getK());
+       final List<CentroidCluster<T>> clusters = chooseInitialCenters(initialPoints);
        // ...
    }
}
```

>Hi Tao.
>
>I've merged PR #128 but please see my comment on the JIRA page.[1]
>
>Thanks for your interest in improving the library,
>Gilles
>
>[1]
>https://issues.apache.org/jira/browse/MATH-1509?focusedCommentId=17064306&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-17064306
>
>---------------------------------------------------------------------
>To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org
>For additional commands, e-mail: dev-h...@commons.apache.org