Re: converting categorical values in csv file to numerical values

2015-11-05 Thread tog
If your corpus is large (NLP), this is indeed the best solution; otherwise (a few words, i.e. categories) I guess you will end up with the same result. On Friday, 6 November 2015, Balachandar R.A. wrote: > Hi Guillaume, > > > This is always an option. However, I read about HashingTF which exactly > do

Re: converting categorical values in csv file to numerical values

2015-11-05 Thread Balachandar R.A.
Hi Guillaume, This is always an option. However, I read about HashingTF, which does exactly this quite efficiently and can scale too. Hence, I am looking for a solution using this technique. Regards, Bala On 5 November 2015 at 18:50, tog wrote: > Hi Bala > > Can't you do a simple dictionary and m
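
For readers unfamiliar with the idea behind HashingTF, here is a minimal sketch of the hashing trick in plain Python. It is not Spark's implementation: the hash function (`zlib.crc32`), the helper names `hash_bucket` and `hashed_counts`, and the bucket count of 1024 are all assumptions chosen for illustration. The point is only that a category string is mapped to an index by hashing it, with no dictionary to build or store, at the cost of possible collisions between distinct values.

```python
import zlib


def hash_bucket(value: str, num_features: int = 1024) -> int:
    """Map a string to a bucket index in [0, num_features).

    crc32 is used because it is deterministic across runs, unlike
    Python's built-in hash() for strings. This is an illustrative
    choice, not the hash function Spark uses.
    """
    return zlib.crc32(value.encode("utf-8")) % num_features


def hashed_counts(values, num_features: int = 1024) -> dict:
    """Count occurrences per bucket, stored sparsely as {index: count}."""
    counts = {}
    for v in values:
        idx = hash_bucket(v, num_features)
        counts[idx] = counts.get(idx, 0) + 1
    return counts


# Hypothetical category values from one CSV column.
column = ["red", "blue", "red", "green"]
vector = hashed_counts(column)
```

With only a handful of distinct categories, two values can still land in the same bucket, so an explicit lookup table may be the safer choice for small vocabularies.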

Re: converting categorical values in csv file to numerical values

2015-11-05 Thread tog
Hi Bala, Can't you build a simple dictionary and map those values to numbers? Cheers Guillaume On 5 November 2015 at 09:54, Balachandar R.A. wrote: > Hi > > > I am new to Spark MLlib and machine learning. I have a CSV file that > consists of around 100 thousand rows and 20 columns. Of these 20 co
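
The dictionary approach suggested in this message can be sketched in a few lines of plain Python. This is only an illustration under assumptions: the sample data, the column name `color`, and the helpers `build_index` and `encode_rows` are hypothetical, and the sketch uses the standard `csv` module rather than Spark.

```python
import csv
import io


def build_index(values) -> dict:
    """Assign each distinct value the next free integer, in first-seen order."""
    index = {}
    for v in values:
        if v not in index:
            index[v] = len(index)
    return index


def encode_rows(rows: list, column: str):
    """Return copies of the rows with `column` replaced by its integer code,
    together with the lookup table that was used."""
    index = build_index(row[column] for row in rows)
    encoded = [{**row, column: index[row[column]]} for row in rows]
    return encoded, index


# Hypothetical CSV content standing in for the real file.
sample = "id,color\n1,red\n2,blue\n3,red\n"
rows = list(csv.DictReader(io.StringIO(sample)))
encoded, index = encode_rows(rows, "color")
```

Keeping the returned `index` around matters: the same table has to be reused when encoding new data, otherwise the same category could receive a different number.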