converting categorical values in csv file to numerical values

Balachandar R.A. Thu, 05 Nov 2015 01:55:46 -0800

Hi,


I am new to Spark MLlib and machine learning. I have a CSV file that
consists of around 100 thousand rows and 20 columns. Of these 20 columns,
10 contain string values. The values in these columns are not necessarily
unique. They are categorical, that is, each value is one of a small set
of, say, 10 possible values. To start with, I could run the examples,
in particular the random forest algorithm, on my local Spark (1.5.1)
installation. However, I have a problem with my own dataset because of
these strings, as the APIs take numerical values. Can anyone tell me how
I can map these categorical values (strings) to numbers and use them with
the random forest algorithm? Any example will be greatly appreciated.


Regards,

Bala
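
One way to do the mapping asked about above is to give each distinct
string in a categorical column an integer index 0..k-1. The following is
only a minimal sketch in plain Python, not Spark code: the function name
index_categorical_columns, the column positions and the sample data are
all hypothetical, and it assumes every non-categorical column parses as a
number. The returned arity dictionary (column index -> number of
categories) has the same shape as the categoricalFeaturesInfo argument
that MLlib's tree-based trainers accept, provided the indices are
adjusted to positions in the feature vector once the label column has
been removed.

```python
import csv
import io


def index_categorical_columns(rows, categorical_cols):
    """Replace string values in the given columns with integer indices.

    Returns (encoded_rows, mappings, arity): mappings[col] maps each
    distinct string to an index 0..k-1, and arity[col] is k.
    """
    mappings = {col: {} for col in categorical_cols}
    encoded = []
    for row in rows:
        new_row = []
        for i, value in enumerate(row):
            if i in mappings:
                # Assign the next free index the first time a value is seen.
                index = mappings[i].setdefault(value, len(mappings[i]))
                new_row.append(float(index))
            else:
                # Assumption: all other columns hold numeric text.
                new_row.append(float(value))
        encoded.append(new_row)
    arity = {col: len(m) for col, m in mappings.items()}
    return encoded, mappings, arity


# Hypothetical three-column sample; columns 0 and 1 are categorical.
sample_csv = """red,small,1.5
blue,large,2.0
red,large,0.5
"""
rows = list(csv.reader(io.StringIO(sample_csv)))
encoded, mappings, arity = index_categorical_columns(rows, [0, 1])

for encoded_row in encoded:
    print(encoded_row)
print(arity)
```

Indices are assigned in order of first appearance, so the same mapping
has to be saved and reused when encoding any data the trained model is
later applied to.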
