Hi,

That is how it is supposed to work: share one SparkContext so that datasets
can be shared between threads.

Ad 1. No
Ad 2. Yes

See CrossValidator and similar validators (e.g. TrainValidationSplit) in
spark.ml.
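
A rough sketch of what "submit from different threads" could look like on the
driver side, in Python. The helper name fit_concurrently, the labels and
train_df are made up for illustration; the only thing assumed about the
estimators is that they have a fit(dataset) method, as pyspark.ml estimators
do. Treat it as one possible arrangement, not as the way spark.ml does it
internally.

```python
from concurrent.futures import ThreadPoolExecutor


def fit_concurrently(estimators, dataset, max_workers=4):
    """Fit every estimator on the same dataset, one driver thread per fit.

    `estimators` maps a label to any object exposing fit(dataset);
    `dataset` is whatever those fit() methods accept, e.g. a DataFrame
    created from the single shared SparkSession. Names are illustrative.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Each submit() calls est.fit(dataset) on its own thread, so the
        # jobs triggered by the fits can be submitted at the same time.
        futures = {name: pool.submit(est.fit, dataset)
                   for name, est in estimators.items()}
        # result() waits for that fit and re-raises any exception it hit.
        return {name: fut.result() for name, fut in futures.items()}


# Hypothetical usage with PySpark (train_df is an assumed DataFrame):
#   from pyspark.ml.regression import LinearRegression, GBTRegressor
#   models = fit_concurrently(
#       {"lr": LinearRegression(), "gbt": GBTRegressor()}, train_df)
```

Caching train_df before the call is probably worthwhile, since every fit
reads the same data. How the concurrent jobs share the cluster depends on the
scheduler settings described on the job scheduling page linked below.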

Jacek
On 9 Jun 2016 7:29 p.m., "Brandon White" <bwwintheho...@gmail.com> wrote:

> For example, say I want to train two Linear Regressions and two
> Gradient-Boosted Tree Regressions.
>
> Using different threads, Spark allows you to submit jobs at the same time
> (see: http://spark.apache.org/docs/latest/job-scheduling.html). If I
> schedule two or more training jobs and they are running at the same time:
>
> 1) Is there any risk that static worker variables or worker state could
> become corrupted leading to incorrect calculations?
> 2) Is Spark ML designed for running two or more training jobs at the same
> time? Is this something the architects consider during implementation?
>
> Thanks,
>
> Brandon
