Re: Why training data in Kmeans Spark streaming clustering

2016-08-11 Thread Bryan Cutler

The algorithm update is just broken into 2 steps: trainOn, which learns/updates the cluster centers, and predictOn, which predicts cluster assignments on data. The StreamingKMeansExample you reference splits the data into training and test sets because you might want to score the predictions. If you don't care ab
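
To make the two-step split above concrete, here is a minimal pure-Python sketch. It is not Spark's implementation or API: the class TinyStreamingKMeans, its methods train_on / predict_on, and the sample points are all hypothetical, and the update is a simple mini-batch rule with a decay factor, assumed here to resemble the one described in the MLlib streaming k-means guide. It only illustrates that training needs no labels, and that prediction simply reads the current centers.

```python
# Illustrative sketch only: hypothetical names, not Spark's StreamingKMeans API.

class TinyStreamingKMeans:
    def __init__(self, centers, decay=1.0):
        self.centers = [list(c) for c in centers]  # current cluster centers
        self.weights = [0.0] * len(centers)        # points seen so far per cluster
        self.decay = decay                         # 1.0 = keep all history

    def _nearest(self, point):
        # Index of the center with the smallest squared Euclidean distance.
        def sq_dist(center):
            return sum((a - b) ** 2 for a, b in zip(center, point))
        return min(range(len(self.centers)), key=lambda i: sq_dist(self.centers[i]))

    def train_on(self, batch):
        # Step 1: learn/update the centers from one unlabeled mini-batch.
        groups = {}
        for point in batch:
            groups.setdefault(self._nearest(point), []).append(point)
        for i in range(len(self.centers)):
            old = self.weights[i] * self.decay     # discounted historical weight
            points = groups.get(i, [])
            if points:
                total = old + len(points)
                dim = len(self.centers[i])
                sums = [sum(p[d] for p in points) for d in range(dim)]
                self.centers[i] = [
                    (self.centers[i][d] * old + sums[d]) / total for d in range(dim)
                ]
                self.weights[i] = total
            else:
                self.weights[i] = old
        return self

    def predict_on(self, batch):
        # Step 2: assign each point to its nearest current center; no update.
        return [self._nearest(point) for point in batch]


model = TinyStreamingKMeans([[0.0, 0.0], [10.0, 10.0]])
model.train_on([[1.0, 1.0], [3.0, 1.0], [9.0, 11.0]])   # unlabeled batch
assignments = model.predict_on([[2.0, 2.0], [8.0, 9.0]])
print(model.centers)   # [[2.0, 1.0], [9.0, 11.0]]
print(assignments)     # [0, 1]
```

Nothing in train_on uses a label. A held-out "test" batch only matters if you want to compare the predicted assignments against known groupings.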

Why training data in Kmeans Spark streaming clustering

2016-08-11 Thread Ahmed Sadek

Dear All, I was wondering why there is training data and testing data in k-means? Shouldn't it be unsupervised learning with access only to the stream data? I found a similar question but couldn't understand the answer. http://stackoverflow.com/questions/30972057/is-the-streaming-k-means-clustering-pr