Hi there,

Sounds like a fun project :)
I'd recommend getting familiar with the existing k-means implementation in Spark, as well as bisecting k-means, and then basing your implementation on those. Focus on the new ML pipelines API, and release it as a package on spark-packages.org. If it gets a lot of use from there, it could be considered for inclusion in ML core in the future.

Good luck!

Sent from my iPhone

> On 31 Jan 2016, at 00:23, Acelot <acelo...@gmail.com> wrote:
>
> Hi All,
> As part of my final project at university, I would like to build an alternative
> version of the k-means algorithm. It is called k-modes and is introduced in the
> paper "Improving the Accuracy and Efficiency of the k-means Clustering Algorithm"
> (link: http://www.iaeng.org/publication/WCE2009/WCE2009_pp308-312.pdf). I would like
> to know of any related work. If anyone is interested in working on this project,
> please contact me.
>
> Kind regards,
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscr...@spark.apache.org
> For additional commands, e-mail: dev-h...@spark.apache.org