much. It is a great help; I will try spark-sklearn.
>>
>> Prem
> *From: *Yanbo Liang
> *Date: *Tuesday, September 5, 2017 at 10:40 AM
> *To: *Patrick McCarthy
> *Cc: *"Timsina, Prem" , "user@spark.apache.org" <
> user@spark.apache.org>
> *Subject: *Re: Apache Spark: Parallelization of Multiple Machine Learning Algorithm
Hi Prem,
How large is your dataset? Can it fit on a single node?
If not, Spark MLlib provides cross-validation (CrossValidator), which can run
multiple machine learning algorithms in parallel on a distributed dataset and
do parameter search. FYI:
https://spark.apache.org/docs/latest/ml-tuning.html#cross-validation
If
You might benefit from watching this JIRA issue -
https://issues.apache.org/jira/browse/SPARK-19071
On Sun, Sep 3, 2017 at 5:50 PM, Timsina, Prem wrote:
> Is there a way to parallelize multiple ML algorithms in Spark? My use case
> is something like this:
>
> A) Run multiple machine learning alg