Hello.

[Message formatting is fine now.  Thanks!]

On Wed, Feb 26, 2020 at 15:20, chentao...@qq.com <chentao...@qq.com> wrote:
>
> Hi,
>
> >Hello.
> >
> >[Please try and set your mail client to send plain text messages.]
> >
> >On Wed, Feb 26, 2020 at 14:05, CT <chentao...@qq.com> wrote:
> >>
> >> Hi Gilles,
> >> ------------------ Original ------------------
> >> From: "GillesSadowski" <gillese...@gmail.com>;
> >> Date: Wed, Feb 26, 2020 05:41 PM
> >> To: "Commons Developers List" <dev@commons.apache.org>;
> >>
> >> Subject: Re: [math] Discuss: New feature MiniBatchKMeansClusterer
> >>
> > [...]
> >>
> >> Do you mean I should file a JIRA issue about reusing "centroidOf" and
> >> "chooseInitialCenters",
> >> then start a PR and a discussion about "ClusterUtils"?
> >> And then start the PR for "MiniBatchKMeansClusterer" once all that is done?
> >
> >I cannot guarantee that the whole process will be streamlined.
> >In effect, you can work on multiple branches (one for each
> >prospective PR).
> >I'd say that you should start by describing (here on the ML) the
> >rationale for "ClusterUtils" (and contrast it with say, a common
> >base class).
> >[Only when the design has been agreed on should a JIRA issue to
> >implement it be created, in order to track the actual
> >coding work.]
>
> OK, I think we should start from here:
>
> The methods "centroidOf" and "chooseInitialCenters" in KMeansPlusPlusClusterer
> could be reused by other k-means clusterers, such as the
> MiniBatchKMeansClusterer that I want to implement.
>
> There are two ways to reuse "centroidOf" and "chooseInitialCenters":
> 1. Extract an abstract base class for k-means clusterers, named
> "AbstractKMeansClusterer", and move "centroidOf" and
> "chooseInitialCenters" into it as protected methods;
> the EmptyClusterStrategy and related logic could also move to
> "AbstractKMeansClusterer".
> 2. Create a static utility class and move "centroidOf" and
> "chooseInitialCenters" into it;
> other useful clustering methods, such as "predict" (which finds the best
> cluster for a specified point), could also be put there.
>

At first sight, I prefer option 1.
Indeed, among other things, "chooseInitialCenters" is a method that is of no
interest to users of the functionality (and so should not be part of the
"public" API).
Method "centroidOf" looks generally useful.  Shouldn't it be part of
the "Cluster" interface?  What is the difference with method "getCenter"
(defined by class "CentroidCluster")?

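To make option 1 concrete, here is a rough, self-contained sketch of what I
have in mind.  All the names below are hypothetical (nothing here is existing
Commons Math code), points are plain "double[]" to keep it short, and the
seeding is a simple random pick, not the actual k-means++ procedure.  The only
point is that the two helpers are "protected", so that subclasses can share
them without exposing them to users.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

// Hypothetical sketch of option 1; these types do not exist in Commons Math.
abstract class AbstractKMeansClustererSketch {
    protected final int k;
    protected final Random random;

    protected AbstractKMeansClustererSketch(int k, long seed) {
        this.k = k;
        this.random = new Random(seed);
    }

    // The only method that users of the functionality would call.
    public abstract List<double[]> cluster(List<double[]> points);

    // Placeholder seeding: k distinct points picked uniformly at random.
    // (k-means++ weights the choice by squared distance instead.)
    protected List<double[]> chooseInitialCenters(List<double[]> points) {
        if (points.size() < k) {
            throw new IllegalArgumentException("fewer points than clusters");
        }
        List<double[]> shuffled = new ArrayList<>(points);
        Collections.shuffle(shuffled, random);
        List<double[]> centers = new ArrayList<>();
        for (double[] p : shuffled.subList(0, k)) {
            centers.add(p.clone());
        }
        return centers;
    }

    // Component-wise mean of the given (non-empty) list of points.
    protected double[] centroidOf(List<double[]> points, int dimension) {
        double[] centroid = new double[dimension];
        for (double[] p : points) {
            for (int i = 0; i < dimension; i++) {
                centroid[i] += p[i];
            }
        }
        for (int i = 0; i < dimension; i++) {
            centroid[i] /= points.size();
        }
        return centroid;
    }
}

// A trivial subclass: seed the centers, then perform one update step.
class SingleStepClustererSketch extends AbstractKMeansClustererSketch {
    SingleStepClustererSketch(int k, long seed) {
        super(k, seed);
    }

    @Override
    public List<double[]> cluster(List<double[]> points) {
        List<double[]> centers = chooseInitialCenters(points);
        List<List<double[]>> assigned = new ArrayList<>();
        for (int i = 0; i < k; i++) {
            assigned.add(new ArrayList<>());
        }
        // Assign each point to the closest center (squared Euclidean distance).
        for (double[] p : points) {
            int best = 0;
            double bestDist = Double.POSITIVE_INFINITY;
            for (int c = 0; c < k; c++) {
                double d = 0;
                for (int i = 0; i < p.length; i++) {
                    double diff = p[i] - centers.get(c)[i];
                    d += diff * diff;
                }
                if (d < bestDist) {
                    bestDist = d;
                    best = c;
                }
            }
            assigned.get(best).add(p);
        }
        // Recompute the centers; an empty cluster simply keeps its seed here.
        List<double[]> updated = new ArrayList<>();
        for (int c = 0; c < k; c++) {
            List<double[]> members = assigned.get(c);
            updated.add(members.isEmpty()
                        ? centers.get(c)
                        : centroidOf(members, members.get(0).length));
        }
        return updated;
    }
}
```

A mini-batch variant would then be another subclass that calls the same two
helpers, while user code only ever sees "cluster".  With option 2, the same
two methods would have to be "public static" in the utility class, hence
visible to everyone.
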
Regards,
Gilles

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org
For additional commands, e-mail: dev-h...@commons.apache.org
