Re: How to prepare data for K means clustering

2016-01-20 Thread Chiwan Park
Hi Ashutosh, you can use basic Flink DataSet operations such as map and filter to transform your data. Basically, you have to define a distance metric between the records in your data. In the example, we use Euclidean distance (see the euclideanDistance method in the Point class). In the map method in SelectNeare
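
To illustrate the distance metric mentioned in this reply, here is a minimal Python sketch of Euclidean distance between two numeric records, plus a nearest-centroid lookup built on it. It is not the code from the Flink example; the function names euclidean_distance and nearest_centroid are hypothetical and chosen only for illustration.

```python
import math


def euclidean_distance(a, b):
    """Euclidean distance between two equal-length numeric records."""
    if len(a) != len(b):
        raise ValueError("records must have the same number of features")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def nearest_centroid(point, centroids):
    """Return the index of the centroid closest to the given point."""
    return min(
        range(len(centroids)),
        key=lambda i: euclidean_distance(point, centroids[i]),
    )


if __name__ == "__main__":
    centroids = [(0.0, 0.0), (5.3, 7.2)]
    print(nearest_centroid((1.2, 2.3), centroids))
```

Because the distance is computed over however many features each record has, the same idea extends beyond two-dimensional points, provided every record has been turned into numbers first.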

How to prepare data for K means clustering

2016-01-20 Thread Ashutosh Kumar
I saw the example code for K means clustering. It takes input data points as pairs of double values (1.2 2.3\n5.3 7.2\.). My question is: how do I convert my business data to this format? I have customer data with columns such as household income, education and several others. I want to do clusteri
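
One possible way to get from such customer records to space-separated doubles is sketched below in Python. Everything specific here is an assumption made for illustration: the field names household_income and education, the ordinal EDUCATION_LEVELS mapping, and the choice of min-max scaling so that no single column dominates the distance. It is a sketch under those assumptions, not a prescribed preprocessing step.

```python
# Hypothetical ordinal encoding for a categorical column.
EDUCATION_LEVELS = {"high_school": 0.0, "bachelor": 1.0, "master": 2.0}


def min_max_scale(values):
    """Rescale a list of numbers to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant column carries no information for clustering.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def to_feature_lines(customers):
    """Turn customer dicts into 'x y' lines of scaled double values."""
    incomes = min_max_scale([float(c["household_income"]) for c in customers])
    education = min_max_scale([EDUCATION_LEVELS[c["education"]] for c in customers])
    return [f"{i} {e}" for i, e in zip(incomes, education)]


if __name__ == "__main__":
    sample = [
        {"household_income": 40000, "education": "high_school"},
        {"household_income": 100000, "education": "master"},
        {"household_income": 70000, "education": "bachelor"},
    ]
    print("\n".join(to_feature_lines(sample)))
```

More columns would add more values per line, in which case the distance computation has to handle points with more than two dimensions.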