You can actually submit multiple jobs to a single SparkContext from different
threads. In the case you mentioned, with two stages that share a common
parent, both will wait for the parent stage to complete and will then execute
in parallel, sharing the cluster resources.
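A minimal sketch of that (assuming Scala, with a placeholder app name and
input path; the cached "parent" RDD stands in for the shared parent stage):

    import org.apache.spark.{SparkConf, SparkContext}

    object ConcurrentJobs {
      def main(args: Array[String]): Unit = {
        val sc = new SparkContext(new SparkConf().setAppName("concurrent-jobs"))

        // The shared parent: both jobs below depend on this cached RDD,
        // so its stage is computed only once.
        val parent = sc.textFile("hdfs:///path/to/input").cache()

        // Each action launches its own job; run from separate threads,
        // the two jobs execute in parallel once the parent stage is done.
        val t1 = new Thread(() => println("total lines: " + parent.count()))
        val t2 = new Thread(() => println("distinct lines: " + parent.distinct().count()))
        t1.start(); t2.start()
        t1.join(); t2.join()

        sc.stop()
      }
    }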
Reading the documentation a little more closely, I realize I'm using the wrong
terminology. I've been using "stages" to refer to what Spark calls a job. I
guess an application (more than one SparkContext) is what I'm asking about.
On Dec 5, 2014 5:19 PM, "Corey Nolet" wrote:
I've read in the documentation that RDDs can be run concurrently when
submitted in separate threads. I'm curious how the scheduler would handle
propagating these down to the tasks.
I have 3 RDDs (sketched below):
- one RDD which loads some initial data, transforms it, and caches it
- two RDDs which use the cached RDD
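A rough sketch of that setup (assuming Scala; the transformations and input
path are placeholders, since the original message is truncated here). The two
derived RDDs are kicked off as concurrent jobs from separate threads via
Futures:

    import org.apache.spark.{SparkConf, SparkContext}
    import scala.concurrent.{Await, Future}
    import scala.concurrent.ExecutionContext.Implicits.global
    import scala.concurrent.duration.Duration

    object ThreeRdds {
      def main(args: Array[String]): Unit = {
        val sc = new SparkContext(new SparkConf().setAppName("three-rdds"))

        // 1) Load some initial data, transform it, and cache it.
        val base = sc.textFile("hdfs:///path/to/input").map(_.toLowerCase).cache()

        // 2) Two RDDs derived from the cached one; each action runs as
        //    its own job, and the two jobs can proceed concurrently.
        val words = Future { base.flatMap(_.split("\\s+")).count() }
        val nonEmpty = Future { base.filter(_.nonEmpty).count() }

        println("words: " + Await.result(words, Duration.Inf))
        println("non-empty lines: " + Await.result(nonEmpty, Duration.Inf))

        sc.stop()
      }
    }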