Re: Multidimensional K-Means

2015-02-15 Thread Attila Tóth
Hi Sean, Thanks for the quick answer. I had not realized that I can make an RDD[Vector] with, e.g., val dataSet = sparkContext.makeRDD(List(Vectors.dense(10.0,20.0), Vectors.dense(20.0,30.0))) Using this, KMeans.train works as it should. So my bad. Thanks again! Attila 2015-02-15 17:29 GMT+01:00
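
For anyone finding this thread later, here is a minimal sketch of the approach described above: each element of the RDD[Vector] is one n-dimensional point, so a two-dimensional data set is simply an RDD of two-element dense vectors. The object name MultiDimKMeansSketch, the helper cluster, the sample points, k = 2, the 20 iterations, and the local[2] master are all assumptions made for illustration and are not taken from the original messages.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.mllib.clustering.{KMeans, KMeansModel}
import org.apache.spark.mllib.linalg.{Vector, Vectors}
import org.apache.spark.rdd.RDD

// Hypothetical object name, used only for this sketch.
object MultiDimKMeansSketch {

  // Builds a tiny two-dimensional data set and clusters it.
  // Each Vector is ONE point; its length is the number of dimensions.
  def cluster(sc: SparkContext): KMeansModel = {
    val points: RDD[Vector] = sc.parallelize(Seq(
      Vectors.dense(1.0, 1.0),
      Vectors.dense(1.5, 2.0),
      Vectors.dense(9.0, 8.0),
      Vectors.dense(10.0, 9.5)
    ))
    // Assumed settings: k = 2 clusters, at most 20 iterations.
    KMeans.train(points, 2, 20)
  }

  def main(args: Array[String]): Unit = {
    // Local master chosen only so the sketch runs without a cluster.
    val conf = new SparkConf()
      .setAppName("MultiDimKMeansSketch")
      .setMaster("local[2]")
    val sc = new SparkContext(conf)
    try {
      val model = cluster(sc)
      // clusterCenters holds one Vector per cluster, each with the
      // same dimensionality as the input points.
      model.clusterCenters.foreach(println)
      // predict returns the index of the closest cluster center.
      println(model.predict(Vectors.dense(9.5, 9.0)))
    } finally {
      sc.stop()
    }
  }
}
```

The same pattern should carry over to higher dimensions by passing more values to Vectors.dense, as long as every point in the RDD has the same length.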

Re: Multidimensional K-Means

2015-02-15 Thread Sean Owen
Clustering operates on a large number of n-dimensional vectors. That seems to be what you are describing, and that is what the MLlib API accepts. What are you expecting that you don't find? Did you have a look at the KMeansModel that this method returns? It has a "clusterCenters" method that gives

Multidimensional K-Means

2015-02-15 Thread Attila Tóth
Dear Spark User List, I'm fairly new to Spark, trying to use it for multi-dimensional clustering (using the k-means clustering from MLlib). However, based on the examples the clustering seems to work only for a single dimension (KMeans.train() accepts an RDD[Vector], which is a vector of doubles -