ay, September 5, 2017 at 10:40 AM
To: Patrick McCarthy
Cc: "Timsina, Prem", "user@spark.apache.org"
Subject: Re: Apache Spark: Parallelization of Multiple Machine Learning
Algorithm
Hi Prem,
How large is your dataset? Can it fit on a single node?
If no, Spark ML
Is there a way to parallelize multiple ML algorithms in Spark? My use case is
something like this:
A) Run multiple machine learning algorithms (Naive Bayes, ANN, Random Forest,
etc.) in parallel.
1) Validate each algorithm using 10-fold cross-validation
B) Feed the output of step A) in second laye