Thanks for the information, Xiangrui. I am using the following example to classify documents:
http://chimpler.wordpress.com/2014/06/11/classifiying-documents-using-naive-bayes-on-apache-spark-mllib/

I am not sure whether this is the best way to convert textual data into vectors. Could you please confirm whether it is a sound approach? I could not identify any shortcomings in it. Also, I am splitting the data 70/30, the same as with Mahout, so the split should not affect the accuracy.

Thanks,
Jatin

-----
Novice Big Data Programmer
--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Accuracy-hit-in-classification-with-Spark-tp13773p13811.html