Re: Prediction using Classification with text attributes in Apache Spark MLLib

2017-10-20 Thread lmk
Trying to improve the old solution. Do we have a better text classifier in Spark MLlib now? Regards, lmk

Re: Prediction using Classification with text attributes in Apache Spark MLLib

2014-11-02 Thread Xiangrui Meng
This operation requires two transformers: 1) Indexer, which maps string features into categorical features; 2) OneHotEncoder, which flattens categorical features into binary features. We are working on the new dataset implementation, so we can easily express those transformations. Sorry for the late reply! If
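The two steps described above can be sketched in plain Python. This is only an illustration of the idea, not Spark code: `fit_string_index` and `one_hot` are hypothetical helper names, and the ordering rule (most frequent value first, ties broken alphabetically) is an assumption made here so the result is deterministic.

```python
from collections import Counter


def fit_string_index(values):
    """Map each distinct string to an integer index (assumed rule: most frequent first)."""
    counts = Counter(values)
    # Sort by descending frequency, then alphabetically so ties are deterministic.
    ordered = sorted(counts, key=lambda v: (-counts[v], v))
    return {value: i for i, value in enumerate(ordered)}


def one_hot(index, size):
    """Flatten a categorical index into a binary vector with a single 1.0."""
    vec = [0.0] * size
    vec[index] = 1.0
    return vec


# Hypothetical categorical column.
colors = ["red", "green", "red", "blue"]
index = fit_string_index(colors)
encoded = [one_hot(index[c], len(index)) for c in colors]
print(index)
print(encoded)
```

Each string value ends up as one binary column, which is the form a linear classifier can consume. With many distinct values the vectors get wide, so a sparse representation would likely be preferable in practice.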

RE: Prediction using Classification with text attributes in Apache Spark MLLib

2014-11-02 Thread ashu
Hi, Sorry to bring back this old thread. What is the state now? Is this problem solved? How does Spark handle categorical data now? Regards, Ashutosh

RE: Prediction using Classification with text attributes in Apache Spark MLLib

2014-06-26 Thread lmk
Thanks Alexander, that gave me a clear idea of what to look for in MLlib. Regards, lmk

RE: Prediction using Classification with text attributes in Apache Spark MLLib

2014-06-26 Thread Ulanov, Alexander
alikrish...@gmail.com] Sent: Wednesday, June 25, 2014 1:27 PM To: u...@spark.incubator.apache.org Subject: RE: Prediction using Classification with text attributes in Apache Spark MLLib Hi Alexander, Just on

RE: Prediction using Classification with text attributes in Apache Spark MLLib

2014-06-25 Thread Debasish Das
; Sent: Wednesday, June 25, 2014 1:27 PM > To: u...@spark.incubator.apache.org > Subject: RE: Prediction using Classification with text attributes in > Apache Spark MLLib > > Hi Alexander, > Just one more question on a related note. Should I be following the same > procedure even if my data is nominal

RE: Prediction using Classification with text attributes in Apache Spark MLLib

2014-06-25 Thread Ulanov, Alexander
krish...@gmail.com] Sent: Wednesday, June 25, 2014 1:27 PM To: u...@spark.incubator.apache.org Subject: RE: Prediction using Classification with text attributes in Apache Spark MLLib Hi Alexander, Just one more question on a related note. Should I be following the same procedure even if my da

RE: Prediction using Classification with text attributes in Apache Spark MLLib

2014-06-25 Thread lmk
Hi Alexander, Just one more question on a related note. Should I follow the same procedure even if my data is nominal (categorical) but has a lot of combinations? (In Weka I used to have it as nominal data.) Regards, -lmk

Re: Prediction using Classification with text attributes in Apache Spark MLLib

2014-06-24 Thread Sean Owen
On Tue, Jun 24, 2014 at 12:28 PM, Ulanov, Alexander wrote: > You need to convert your text to vector space model: > http://en.wikipedia.org/wiki/Vector_space_model > and then pass it to SVM. As far as I know, in previous versions of MLlib > there was a special class for doing this: > https://gi

RE: Prediction using Classification with text attributes in Apache Spark MLLib

2014-06-24 Thread Ulanov, Alexander
com] Sent: Tuesday, June 24, 2014 3:41 PM To: u...@spark.incubator.apache.org Subject: RE: Prediction using Classification with text attributes in Apache Spark MLLib Hi Alexander, Thanks for your prompt response. Earlier I was executing this Prediction using Weka only. But now we are moving to a hu

RE: Prediction using Classification with text attributes in Apache Spark MLLib

2014-06-24 Thread lmk
Hi Alexander, Thanks for your prompt response. Earlier I was running this prediction using Weka only. But now we are moving to a huge dataset and hence to Apache Spark MLlib. Is there any other way to convert to libSVM format? Or is there any other simpler algorithm that I can use in MLlib? Than
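For reference, a LIBSVM-format line is a label followed by `index:value` pairs for the non-zero features, with 1-based indices in ascending order. A minimal sketch of writing such a line from a dense feature list follows; `to_libsvm_line` is a hypothetical helper, not part of Weka or MLlib.

```python
def to_libsvm_line(label, features):
    """Format one example as 'label index:value ...' in LIBSVM text format."""
    # Keep only non-zero entries; LIBSVM indices are 1-based and ascending.
    pairs = [f"{i + 1}:{v:g}" for i, v in enumerate(features) if v != 0]
    return " ".join([f"{label:g}"] + pairs)


# Hypothetical example: label 1 with four features, two of them zero.
line = to_libsvm_line(1, [0.0, 2.0, 0.0, 0.5])
print(line)
```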

RE: Prediction using Classification with text attributes in Apache Spark MLLib

2014-06-24 Thread Ulanov, Alexander
Hi, You need to convert your text to a vector space model: http://en.wikipedia.org/wiki/Vector_space_model and then pass it to an SVM. As far as I know, in previous versions of MLlib there was a special class for doing this: https://github.com/amplab/MLI/blob/master/src/main/scala/feat/NGrams.scala.
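As a rough illustration of the vector space conversion mentioned above, the following plain-Python sketch turns each document into a term-frequency vector over a shared vocabulary. It is not the NGrams class linked above; the whitespace tokenizer and the helper names `build_vocabulary` and `to_term_frequency_vector` are assumptions made for this example.

```python
def build_vocabulary(documents):
    """Assign each distinct lowercase token an index in order of first appearance."""
    vocab = {}
    for doc in documents:
        for token in doc.lower().split():
            vocab.setdefault(token, len(vocab))
    return vocab


def to_term_frequency_vector(doc, vocab):
    """Count how often each vocabulary token occurs in the document."""
    vec = [0.0] * len(vocab)
    for token in doc.lower().split():
        if token in vocab:  # tokens unseen at vocabulary-building time are ignored
            vec[vocab[token]] += 1.0
    return vec


# Hypothetical toy corpus.
docs = ["spark is fast", "weka is small", "spark spark mllib"]
vocab = build_vocabulary(docs)
vectors = [to_term_frequency_vector(d, vocab) for d in docs]
print(vocab)
print(vectors)
```

The resulting numeric vectors, paired with class labels, are the kind of input an SVM or another linear classifier expects. Real text usually needs better tokenization and some weighting such as TF-IDF.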