Re: Spark MLLIB multiclass classification

Feynman Liang Sat, 29 Aug 2015 22:53:02 -0700

I think the spark.ml logistic regression currently supports only binary
(0/1) labels. If you need multiclass, I would suggest looking at the
spark.ml decision trees. If you don't care much about pipelines, you could
instead use the spark.mllib logistic regression after featurizing.
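
For what it's worth, here is a minimal sketch of the data preparation that
multiclass training needs, written in plain Java with no Spark dependency.
It assumes the labels should be class indices 0.0, 1.0, ..., k-1, and it
uses the hashing trick as one possible way of featurizing text. The class
MulticlassPrep, its two helpers, and the class names "other", "spark" and
"hadoop" are hypothetical names chosen for illustration; they are not Spark
APIs.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class MulticlassPrep {

    /**
     * Hypothetical helper: gives each distinct class name a label
     * 0.0, 1.0, 2.0, ... in first-seen order.
     */
    public static Map<String, Double> indexLabels(List<String> classNames) {
        Map<String, Double> index = new LinkedHashMap<>();
        for (String name : classNames) {
            if (!index.containsKey(name)) {
                index.put(name, (double) index.size());
            }
        }
        return index;
    }

    /**
     * Hypothetical helper: hashing-trick term frequencies. Each token is
     * counted in the bucket given by its hash code modulo numFeatures.
     */
    public static double[] hashedTermFrequencies(String text, int numFeatures) {
        double[] features = new double[numFeatures];
        for (String token : text.toLowerCase().split("\\s+")) {
            if (token.isEmpty()) {
                continue; // skip the empty token produced by leading whitespace
            }
            int bucket = Math.floorMod(token.hashCode(), numFeatures);
            features[bucket] += 1.0;
        }
        return features;
    }

    public static void main(String[] args) {
        // Assumed class names; the thread only shows the numeric labels.
        Map<String, Double> labels =
            indexLabels(Arrays.asList("other", "spark", "hadoop"));

        String[] docs = {"a b c d e spark", "b d", "hadoop f g h"};
        String[] docClasses = {"spark", "other", "hadoop"};

        for (int i = 0; i < docs.length; i++) {
            double label = labels.get(docClasses[i]);
            double[] features = hashedTermFrequencies(docs[i], 16);
            System.out.println(label + " -> " + Arrays.toString(features));
        }
    }
}
```

In Spark, each (label, features) pair would then be wrapped in a
LabeledPoint. As far as I know, LogisticRegressionWithLBFGS in spark.mllib
takes the number of classes through setNumClasses, so three classes would
need setNumClasses(3).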


On Sat, Aug 29, 2015 at 10:49 PM, Zsombor Egyed <[email protected]>
wrote:

> Thank you, I saw this before, but it is "just" binary classification, so
> how can I extend it to multiclass classification?
>
> Simply add different labels?
> e.g.:
>
>   new LabeledDocument(0L, "a b c d e spark", 1.0),
>   new LabeledDocument(1L, "b d", 0.0),
>   new LabeledDocument(2L, "hadoop f g h", 2.0),
>
>
>
>
> On Sun, Aug 30, 2015 at 7:32 AM, Feynman Liang <[email protected]>
> wrote:
>
>> I would check out the Pipeline code example
>> <https://spark.apache.org/docs/latest/ml-guide.html#example-pipeline>
>>
>> On Sat, Aug 29, 2015 at 9:23 PM, Zsombor Egyed <[email protected]>
>> wrote:
>>
>>> Hi!
>>>
>>> I want to implement multiclass classification for documents.
>>> I have different kinds of text files, and I want to classify them
>>> with Spark MLlib in Java.
>>>
>>> Do you have any code examples?
>>>
>>> Thanks!
>>>
>>> --
>>>
>>>
>>> *Egyed Zsombor *
>>> Junior Big Data Engineer
>>>
>>>
>>>
>>> Mobile: +36 70 320 65 81 | Twitter:@starschemaltd
>>>
>>> Email: [email protected] <[email protected]> | Web:
>>> www.starschema.net
>>>
>>>
>>
>
