Re: [math]Discuss: There should be a CalinskiHarabaszClusterEvaluator in ml package

Gilles Sadowski Fri, 06 Mar 2020 05:05:45 -0800

Hello.

2020-03-06 9:48 UTC+01:00, chentao...@qq.com <chentao...@qq.com>:
> Hi,
>     For centroid-based clustering algorithms in machine learning, we often
> use the Calinski-Harabasz score to evaluate which algorithm, or how many
> centers, is best for a dataset.
>     The Python library sklearn implements the Calinski-Harabasz score as
> sklearn.metrics.calinski_harabasz_score.


Could you post a reference (most of our documentation points
to "Wikipedia" or "MathWorld")?

> I think there should be a CalinskiHarabaszClusterEvaluator in Commons Math:

At first sight, the approach would be to define a functional
interface (with the "score" method).
Then an "enum" that would be a factory of evaluators, along
the lines of what has been done in "Commons RNG" (see class
"RandomSource"[1]).
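
To make the idea concrete, here is a minimal sketch of that layout. All
names are hypothetical (nothing here exists in Commons Math), clusters
are represented as plain lists of double[] points rather than with the
"Cluster"/"Clusterable" types, and the single evaluator shown (a
within-cluster sum of squares) is only a placeholder to show where an
actual score would plug in. It assumes non-empty clusters.

```java
import java.util.List;

/** Hypothetical functional interface: a single "score" method. */
@FunctionalInterface
interface ClusterScore {
    double score(List<List<double[]>> clusters);
}

/** Hypothetical factory enum: each constant names one available evaluator. */
enum ClusterScores {
    /** Sum of squared distances from each point to its cluster centroid. */
    SUM_OF_SQUARES(ClusterScores::sumOfSquares);

    private final ClusterScore evaluator;

    ClusterScores(ClusterScore evaluator) {
        this.evaluator = evaluator;
    }

    /** Returns the evaluator associated with this constant. */
    ClusterScore create() {
        return evaluator;
    }

    private static double sumOfSquares(List<List<double[]>> clusters) {
        double sum = 0;
        for (List<double[]> cluster : clusters) {
            final int dim = cluster.get(0).length;
            // Centroid of this cluster (assumes the cluster is not empty).
            final double[] centroid = new double[dim];
            for (double[] p : cluster) {
                for (int d = 0; d < dim; d++) {
                    centroid[d] += p[d] / cluster.size();
                }
            }
            // Accumulate squared distances to the centroid.
            for (double[] p : cluster) {
                for (int d = 0; d < dim; d++) {
                    final double diff = p[d] - centroid[d];
                    sum += diff * diff;
                }
            }
        }
        return sum;
    }
}
```

A caller would then write something like
"ClusterScores.SUM_OF_SQUARES.create().score(clusters)", and adding a
new evaluator amounts to adding one enum constant.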

> ```java
> package org.apache.commons.math4.ml.clustering.evaluation;
>
> import org.apache.commons.math4.ml.clustering.Cluster;
> import org.apache.commons.math4.ml.clustering.Clusterable;
>
> import java.util.List;
>
> public class CalinskiHarabaszClusterEvaluator<T extends Clusterable>
>         extends ClusterEvaluator<T> {
>     @Override
>     public double score(List<? extends Cluster<T>> clusters) {
>         //TODO: Implement the Calinski-Harabasz Score algorithm
>         return 0;
>     }
>
>     @Override
>     public boolean isBetterScore(double score1, double score2) {
>         return score1 > score2;
>     }

This method does not seem very useful.

> }
> ```
>
> The code can be implemented by reading the algorithm's documentation,
> or by translating Python's sklearn.metrics.calinski_harabasz_score.

What's the license of that code?
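
Independently of the licensing question, the score can be written down
from its usual definition. With n points in k clusters, let B be the sum
over clusters of (cluster size) x (squared distance from the cluster
centroid to the overall mean), and W the sum of squared distances from
each point to its own centroid; the score is (B / (k - 1)) / (W / (n - k)).
A minimal sketch under those assumptions follows. The class name and the
use of plain double[] points are illustrative only (not the Commons Math
API), and it assumes k >= 2 non-empty clusters, n > k and W > 0.

```java
import java.util.List;

/** Illustrative only: a direct transcription of the usual definition. */
final class CalinskiHarabaszSketch {
    private CalinskiHarabaszSketch() {}

    static double score(List<List<double[]>> clusters) {
        final int k = clusters.size();
        final int dim = clusters.get(0).get(0).length;

        // Per-cluster centroids and the overall mean of all points.
        final double[][] centroids = new double[k][dim];
        final double[] globalMean = new double[dim];
        int n = 0;
        for (int j = 0; j < k; j++) {
            final List<double[]> cluster = clusters.get(j);
            for (double[] p : cluster) {
                for (int d = 0; d < dim; d++) {
                    centroids[j][d] += p[d];
                    globalMean[d] += p[d];
                }
            }
            for (int d = 0; d < dim; d++) {
                centroids[j][d] /= cluster.size();
            }
            n += cluster.size();
        }
        for (int d = 0; d < dim; d++) {
            globalMean[d] /= n;
        }

        // Between-cluster (B) and within-cluster (W) dispersion.
        double between = 0;
        double within = 0;
        for (int j = 0; j < k; j++) {
            final List<double[]> cluster = clusters.get(j);
            between += cluster.size() * squaredDistance(centroids[j], globalMean);
            for (double[] p : cluster) {
                within += squaredDistance(p, centroids[j]);
            }
        }

        return (between / (k - 1)) / (within / (n - k));
    }

    private static double squaredDistance(double[] a, double[] b) {
        double sum = 0;
        for (int d = 0; d < a.length; d++) {
            final double diff = a[d] - b[d];
            sum += diff * diff;
        }
        return sum;
    }
}
```

As a check by hand: for the two clusters {(0,0), (0,2)} and
{(10,0), (10,2)}, B = 100 and W = 4, so the score is (100 / 1) / (4 / 2) = 50.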

Regards,
Gilles

[1] https://commons.apache.org/proper/commons-rng/commons-rng-simple/javadocs/api-1.3/org/apache/commons/rng/simple/RandomSource.html

