Hi,

>Hello.
>
>On Tue, 24 Mar 2020 at 06:39, chentao...@qq.com <chentao...@qq.com> wrote:
>>
>> Hi,
>>
>>     I have started 2 PRs to solve the problem you mentioned.
>>
>>     About the "CentroidInitializer" I have a new idea:
>> Move the CentroidInitializers in as inner classes of "KMeansPlusPlusClusterer",
>> and add a constructor parameter and a property "useKMeansPlusPlus" to
>> "KMeansPlusPlusClusterer":
>> ```java
>> // Add "useKMeansPlusPlus" to "KMeansPlusPlusClusterer"
>> public class KMeansPlusPlusClusterer<T extends Clusterable> extends 
>> Clusterer<T> {
>>     public KMeansPlusPlusClusterer(final int k, final int maxIterations,
>>                                final DistanceMeasure measure,
>>                                final UniformRandomProvider random,
>>                                final EmptyClusterStrategy emptyStrategy,
>> +                             final boolean useKMeansPlusPlus) {
>>     // ...
>> -  // Use K-means++ to choose the initial centers.
>> -  this.centroidInitializer = new KMeansPlusPlusCentroidInitializer(measure, 
>> random);
>> +  this.useKMeansPlusPlus = useKMeansPlusPlus;
>> }
>> ```
>
>What if one comes up with a third way to initialize the centroids?
>If you can ensure that there is no other initialization procedure,
>a boolean is fine, if not, we could still make the existing procedures
>package-private (e.g. moving them in as classes defined within
>"KMeansPlusPlusClusterer").

As far as I know, k-means has two center initialization methods so far:
random and k-means++.
Using a boolean to choose between them is good enough for current use,
but there are two situations in which users need to implement the center
initialization method themselves:
1. The Commons Math implementation is not good enough;
2. There are new center initialization methods.
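
To illustrate both situations, here is a minimal sketch of a pluggable
initializer. Everything in it is hypothetical ("CentroidInitializer" as an
interface, "RandomInitializer", "KMeansPlusPlusInitializer", plain "double[]"
points and "java.util.Random"); it is not the Commons Math code, only a way to
show what a boolean cannot express:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Random;

/** Hypothetical strategy interface; not part of the Commons Math API. */
interface CentroidInitializer {
    /** Picks k initial centers from the data (assumes 1 <= k <= points.size()). */
    List<double[]> chooseCenters(List<double[]> points, int k, Random rng);
}

/** Picks k distinct data points uniformly at random. */
final class RandomInitializer implements CentroidInitializer {
    @Override
    public List<double[]> chooseCenters(List<double[]> points, int k, Random rng) {
        List<double[]> pool = new ArrayList<>(points);
        // Partial Fisher-Yates shuffle: the first k slots end up as the sample.
        for (int i = 0; i < k; i++) {
            int j = i + rng.nextInt(pool.size() - i);
            double[] tmp = pool.get(i);
            pool.set(i, pool.get(j));
            pool.set(j, tmp);
        }
        return new ArrayList<>(pool.subList(0, k));
    }
}

/** k-means++ seeding: each new center is drawn with probability ~ D(x)^2. */
final class KMeansPlusPlusInitializer implements CentroidInitializer {
    @Override
    public List<double[]> chooseCenters(List<double[]> points, int k, Random rng) {
        List<double[]> centers = new ArrayList<>(k);
        centers.add(points.get(rng.nextInt(points.size())));

        // Squared distance from each point to its nearest chosen center.
        double[] minDistSq = new double[points.size()];
        Arrays.fill(minDistSq, Double.POSITIVE_INFINITY);

        while (centers.size() < k) {
            double[] last = centers.get(centers.size() - 1);
            double sum = 0;
            for (int i = 0; i < points.size(); i++) {
                minDistSq[i] = Math.min(minDistSq[i], distSq(points.get(i), last));
                sum += minDistSq[i];
            }
            // Roulette-wheel selection over the squared distances.
            double r = rng.nextDouble() * sum;
            double acc = 0;
            int chosen = points.size() - 1; // fallback, e.g. when sum == 0
            for (int i = 0; i < points.size(); i++) {
                acc += minDistSq[i];
                if (acc > r) {
                    chosen = i;
                    break;
                }
            }
            centers.add(points.get(chosen));
        }
        return centers;
    }

    private static double distSq(double[] a, double[] b) {
        double s = 0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            s += d * d;
        }
        return s;
    }
}
```

With an interface like this, a user who is unhappy with our implementation, or
who wants a new method, passes their own object; with a boolean they would
have to change the clusterer itself.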

>
>Also, in the current implementation of "KMeansPlusPlusClusterer", the
>initialization is not configurable ("KMeansPlusPlusCentroidInitializer").
>Perhaps we don't want to depart from the original (?) algorithm; if so,
>the new constructor could be made protected (thus simplifying the API). 

k-means++ is the recommended center initialization method nowadays.
Should we let users fall back to choosing the centers at random? That is a
trade-off we need to weigh.
Should we make the API simple or rich?
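
To make the trade-off concrete, here is a minimal sketch of the "simple API"
option: the public constructor always uses k-means++, and the choice is only
reachable through a protected constructor. The class names "SketchClusterer"
and "RandomInitClusterer" are hypothetical and the clustering logic is left
out; only the constructor visibility and the accessor are shown:

```java
/** Hypothetical stand-in for the clusterer; only the API shape is sketched. */
class SketchClusterer {
    private final int k;
    private final boolean useKMeansPlusPlus;

    /** Simple public API: k-means++ initialization is always used. */
    public SketchClusterer(int k) {
        this(k, true);
    }

    /** Richer entry point, visible only to subclasses (and the same package). */
    protected SketchClusterer(int k, boolean useKMeansPlusPlus) {
        this.k = k;
        this.useKMeansPlusPlus = useKMeansPlusPlus;
    }

    public int getK() {
        return k;
    }

    /** Lets callers read back how the clusterer was configured. */
    public boolean isUseKMeansPlusPlus() {
        return useKMeansPlusPlus;
    }
}

/** A subclass that opts into random initialization via the protected constructor. */
class RandomInitClusterer extends SketchClusterer {
    RandomInitClusterer(int k) {
        super(k, false);
    }
}
```

Making the second constructor public instead would give the "rich API" with
no other change.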

>
>> public boolean isUseKMeansPlusPlus() {return this.useKMeansPlusPlus;}
>
>Why should this method be defined? 

To let users read back their clusterer's parameters, the same as "getEmptyStrategy()".

>
>Regards,
>Gilles
>
>> [...]
>