Hello everyone, I'm trying to use MLlib's k-means algorithm.
I first ran it on the raw data. Here is an example of a line from my input data set:

82.9817 3281.4495

With the parameters *numClusters* = 4 and *numIterations* = 20, the result is:

*WSSSE = 6.375371241589461E9*

Then I normalized my data, so the same line becomes:

0.02219046937793337492 0.97780953062206662508

With the same parameters, the result is now:

*WSSSE = 0.04229916511906393*

Is it normal that normalization improves my results so much? And why isn't the WSSSE itself normalized? It seems that having smaller feature values simply leads to a smaller WSSSE. I'm sure I'm missing something here!

Florent
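
P.S. In case it helps, here is a minimal sketch of what I am running. The input path, object name, and parsing details are placeholders, and I am assuming the RDD-based MLlib API (KMeans.train and KMeansModel.computeCost) to get the WSSSE:

// KMeansWssse.scala -- sketch of the job described above
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.mllib.clustering.KMeans
import org.apache.spark.mllib.linalg.Vectors

object KMeansWssse {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("kmeans-wssse"))

    // Each input line holds space-separated feature values, e.g. "82.9817 3281.4495".
    // The path below is a placeholder.
    val data = sc.textFile("hdfs:///path/to/input.txt")
      .map(line => Vectors.dense(line.trim.split("\\s+").map(_.toDouble)))
      .cache()

    val numClusters = 4
    val numIterations = 20
    val model = KMeans.train(data, numClusters, numIterations)

    // WSSSE: sum of squared distances from each point to its nearest cluster
    // centre, so it is expressed in the squared units of the features.
    val wssse = model.computeCost(data)
    println(s"WSSSE = $wssse")

    sc.stop()
  }
}

For the normalized run I use the same code, only with the pre-normalized file as input.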