Re: Spark MLLIB multiclass classification

2015-08-29 Thread Feynman Liang
I think the spark.ml logistic regression currently only supports 0/1 labels. If you need multiclass, I would suggest looking at either the spark.ml decision trees or, if you don't care too much about pipelines, the spark.mllib logistic regression after featurizing. On Sat, Aug 29, 20
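
For illustration, here is a minimal sketch of the spark.mllib route, assuming Spark 1.4 or later and reusing the three example documents from elsewhere in this thread. The app name, the local master, the feature dimension of 1000, and the whitespace tokenization are all assumptions made for the example, not something the thread specifies.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.mllib.classification.LogisticRegressionWithLBFGS
import org.apache.spark.mllib.feature.HashingTF
import org.apache.spark.mllib.regression.LabeledPoint

// Assumed local setup; getOrCreate reuses an existing context (e.g. in spark-shell).
val conf = new SparkConf().setAppName("MulticlassDocs").setMaster("local[*]")
val sc = SparkContext.getOrCreate(conf)

// (text, label) pairs; multiclass labels must be 0.0, 1.0, ..., numClasses - 1.
val docs = sc.parallelize(Seq(
  ("a b c d e spark", 1.0),
  ("b d", 0.0),
  ("hadoop f g h", 2.0)))

// Featurize: hash whitespace-separated tokens into term-frequency vectors.
// The dimension 1000 is an arbitrary choice for this sketch.
val hashingTF = new HashingTF(1000)
val training = docs.map { case (text, label) =>
  LabeledPoint(label, hashingTF.transform(text.split(" ").toSeq))
}.cache()

// Multinomial logistic regression; numClasses must cover every label value.
val model = new LogisticRegressionWithLBFGS()
  .setNumClasses(3)
  .run(training)

// Predict the class of a new document, featurized the same way.
val prediction = model.predict(hashingTF.transform("spark a b".split(" ").toSeq))
println(s"Predicted class: $prediction")
```

With only three training documents the predicted class is not meaningful; the point is only that labels beyond 0/1 are accepted once setNumClasses is set accordingly.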

Re: Spark MLLIB multiclass classification

2015-08-29 Thread Zsombor Egyed
Thank you, I saw this before, but it is "just" a binary classification, so how can I extend it to multiclass classification? Do I simply add different labels? E.g.: new LabeledDocument(0L, "a b c d e spark", 1.0), new LabeledDocument(1L, "b d", 0.0), new LabeledDocument(2L, "hadoop f g h", 2.0)

Re: Spark MLLIB multiclass classification

2015-08-29 Thread Feynman Liang
I would check out the Pipeline code example.

On Sat, Aug 29, 2015 at 9:23 PM, Zsombor Egyed wrote:
> Hi!
>
> I want to implement a multiclass classification for documents.
> So I have different kinds of text files, and I want t