Thanks for the information, Xiangrui. I am using the following example to
classify documents:

http://chimpler.wordpress.com/2014/06/11/classifiying-documents-using-naive-bayes-on-apache-spark-mllib/

I am not sure whether this is the best way to convert textual data into vectors.
Could you please confirm whether this approach is sound? I could not identify
any shortcomings myself.
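
For reference, here is a minimal sketch of the general idea of turning text into TF-IDF vectors over a vocabulary. It is plain Python, not the Spark MLlib API and not the code from the linked post; the helper names (tokenize, build_vocabulary, tfidf_vectors) and the smoothed IDF formula are assumptions chosen for illustration.

```python
import math
from collections import Counter


def tokenize(text):
    # Lowercase and split on whitespace. Real pipelines usually also
    # strip punctuation and stop words; that is omitted here.
    return text.lower().split()


def build_vocabulary(docs):
    # Map each distinct term to a column index (sorted for reproducibility).
    terms = sorted({term for doc in docs for term in tokenize(doc)})
    return {term: index for index, term in enumerate(terms)}


def tfidf_vectors(docs, vocab):
    n_docs = len(docs)
    tokenized = [tokenize(doc) for doc in docs]
    # Document frequency: number of documents that contain each term.
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        term_counts = Counter(tokens)
        vector = [0.0] * len(vocab)
        for term, count in term_counts.items():
            if term not in vocab:
                continue  # ignore terms outside the vocabulary
            # Smoothed IDF (an assumed variant); it is never negative,
            # which matters because multinomial Naive Bayes expects
            # non-negative feature values.
            idf = math.log((n_docs + 1) / (doc_freq[term] + 1))
            vector[vocab[term]] = count * idf
        vectors.append(vector)
    return vectors
```

With this weighting, a term that occurs in every document gets weight zero, so how the text is tokenized and filtered directly changes which features the classifier sees.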

Also, I am splitting the data into 70/30 sets, the same ratio I used with
Mahout, so the split itself should not affect the accuracy comparison.
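
One caveat: a 70/30 ratio fixes only the proportions, not which documents land in each set, so two runs can still be evaluated on different documents. A minimal plain-Python sketch of a reproducible split follows; the function name split_70_30 and the seed value are hypothetical, and this is not how Spark or Mahout split data internally.

```python
import random


def split_70_30(items, seed=42):
    # Copy the input so the caller's data is not reordered.
    shuffled = list(items)
    # A dedicated, seeded generator makes the shuffle repeatable.
    random.Random(seed).shuffle(shuffled)
    cut = int(round(0.7 * len(shuffled)))
    # First 70% for training, remaining 30% for testing.
    return shuffled[:cut], shuffled[cut:]
```

Saving the resulting two sets once and feeding the same files to both tools would remove the split as a source of difference.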

Thanks,
Jatin




-----
Novice Big Data Programmer
--
View this message in context: 
http://apache-spark-user-list.1001560.n3.nabble.com/Accuracy-hit-in-classification-with-Spark-tp13773p13811.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.
