Hi guys,

I am trying to understand Spark's concurrency model. I read the following at 
https://spark.apache.org/docs/1.0.2/job-scheduling.html:

" Inside a given Spark application (SparkContext instance), multiple parallel 
jobs can run simultaneously if they were submitted from separate threads. By 
“job”, in this section, we mean a Spark action (e.g. save, collect) and any 
tasks that need to run to evaluate that action. Spark’s scheduler is fully 
thread-safe and supports this use case to enable applications that serve 
multiple requests (e.g. queries for multiple users)."
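
My reading of this is that two actions submitted from two separate threads 
should show up as two concurrent jobs. Here is a minimal sketch of what I 
understand that to mean (the data and the particular actions are just 
placeholders):

import java.util.Arrays;
import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;

public class ParallelJobs {
  public static void main(String[] args) throws InterruptedException {
    JavaSparkContext sc =
        new JavaSparkContext(new SparkConf().setAppName("parallel-jobs"));
    final JavaRDD<Integer> rdd = sc.parallelize(Arrays.asList(1, 2, 3, 4, 5));

    // Each thread submits its own action, so each action should become
    // its own job in the scheduler.
    Thread t1 = new Thread(new Runnable() {
      public void run() {
        System.out.println("count = " + rdd.count());        // action -> job
      }
    });
    Thread t2 = new Thread(new Runnable() {
      public void run() {
        System.out.println("collected = " + rdd.collect());  // action -> job
      }
    });
    t1.start();
    t2.start();
    t1.join();
    t2.join();

    sc.stop();
  }
}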

I have searched everywhere but still cannot figure out:
1. How do I start two or more jobs in one Spark driver, in Java code? Is the 
sketch above the right approach? In my actual program I just wrote two actions 
one after the other (a simplified version is after this list), and the jobs 
are still numbered 0, 1, 2, 3... and appear to run sequentially.
2. Do the stages run concurrently? They are always numbered in order 0, 1, 2, 
3... on the Spark stages UI.
3. Can I pull the data out of an RDD, e.g. populate a POJO myself and compute 
on it?
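
For reference, the code from question 1 is essentially the following 
(simplified; the input path and the particular actions are placeholders):

import java.util.List;
import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;

public class TwoActions {
  public static void main(String[] args) {
    JavaSparkContext sc =
        new JavaSparkContext(new SparkConf().setAppName("two-actions"));
    JavaRDD<String> lines = sc.textFile(args[0]);

    // Two actions, one after the other, on the single driver thread.
    // In the UI these show up as job 0 and job 1 and run sequentially.
    long total = lines.count();           // action 1 -> job 0
    List<String> sample = lines.take(10); // action 2 -> job 1

    System.out.println("total = " + total + ", sample = " + sample);
    sc.stop();
  }
}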

Thanks in advance, guys.



35597...@qq.com