RE: The concurrent model of spark job/stage/task

2014-08-31 Thread Liu, Raymond
1, 2: As the docs mention, jobs run concurrently "if they were submitted from separate threads"; say, you fork threads off your main thread and invoke an action in each thread. Jobs and stages are always numbered in order, but that order reflects when they were generated, not necessarily when they execute. In your case, If you just call mu
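The threading pattern described above can be sketched in plain Scala. Note this is a minimal illustration of the pattern only: `runAction` is a hypothetical stand-in for a blocking RDD action such as `rdd.count()`, which in a real application would run against a shared SparkContext.

```scala
import scala.concurrent.{Await, Future}
import scala.concurrent.duration._
import scala.concurrent.ExecutionContext.Implicits.global

object ConcurrentActions {
  // Hypothetical stand-in for a blocking RDD action such as rdd.count();
  // in a real Spark application this would run against a shared SparkContext.
  def runAction(name: String): String = {
    Thread.sleep(100) // simulate the blocking call
    s"$name finished"
  }

  def main(args: Array[String]): Unit = {
    // Each Future runs on its own thread, so both "actions" are in
    // flight at the same time, just like two Spark jobs submitted
    // from separate threads of one driver.
    val job1 = Future(runAction("job-1"))
    val job2 = Future(runAction("job-2"))

    val results = Await.result(Future.sequence(Seq(job1, job2)), 10.seconds)
    results.foreach(println)
  }
}
```

Because both actions block their own thread rather than the main thread, the scheduler can interleave the two resulting jobs; the job IDs still reflect submission order, not completion order.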

RE: The concurrent model of spark job/stage/task

2014-08-29 Thread linkpatrickliu
t action is too large, maybe the second action will be starved. 3. I think the question is how to persist the RDD data to local disk? You could use saveAsTextFile(path) or saveAsSequenceFile(path) to persist RDD data to local disk. Hope this helps you. Best regards, Patrick Liu Date: Thu, 28 A
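For reference, `saveAsTextFile(path)` treats `path` as a directory and writes one `part-NNNNN` file per partition. A minimal sketch of that on-disk layout in plain Scala, without a Spark dependency (the hand-made `partitions` sequence stands in for an RDD's partitions):

```scala
import java.nio.file.{Files, Paths}
import scala.jdk.CollectionConverters._

object SaveAsTextFileSketch {
  // Mimics the on-disk layout of RDD.saveAsTextFile: the output "path"
  // is a directory holding one part-file per partition, each line of
  // the partition becoming one line of the file.
  def save(path: String, partitions: Seq[Seq[String]]): Unit = {
    Files.createDirectories(Paths.get(path))
    partitions.zipWithIndex.foreach { case (lines, i) =>
      val partFile = Paths.get(path, f"part-$i%05d")
      Files.write(partFile, lines.asJava)
    }
  }

  def main(args: Array[String]): Unit = {
    save("out-dir", Seq(Seq("a", "b"), Seq("c")))
    // Creates out-dir/part-00000 (lines "a", "b") and out-dir/part-00001 (line "c").
  }
}
```

In a real cluster deployment, note that "local disk" means local to each executor: every executor writes its own partitions to its own filesystem unless the path points at shared storage such as HDFS.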

Re: RE: The concurrent model of spark job/stage/task

2014-08-28 Thread 35597...@qq.com
14:01 To: user Subject: RE: The concurrent model of spark job/stage/task Hi, Please see the answers following each question. If there's any mistake, please let me know. Thanks! I am not sure which mode you are running in. So I will assume you are using the spark-submit script to submit

RE: The concurrent model of spark job/stage/task

2014-08-28 Thread linkpatrickliu
Hi, Please see the answers following each question. If there's any mistake, please let me know. Thanks! I am not sure which mode you are running in. So I will assume you are using the spark-submit script to submit Spark applications to a Spark cluster (spark-standalone or YARN). 1. how to start 2 or more
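Since the reply assumes the spark-submit script, running two or more applications concurrently amounts to invoking it once per application; a sketch, in which the master URL, class names, and jar paths are placeholders:

```shell
# Submit two independent applications to the same standalone cluster.
# com.example.MyAppOne/MyAppTwo and the jar paths are hypothetical.
spark-submit \
  --master spark://master-host:7077 \
  --class com.example.MyAppOne \
  /path/to/app-one.jar &

spark-submit \
  --master spark://master-host:7077 \
  --class com.example.MyAppTwo \
  /path/to/app-two.jar &

wait   # block until both applications finish
```

Each invocation becomes a separate application with its own driver and executors; they only run truly concurrently if the cluster has enough cores and memory for both, otherwise the second queues until resources free up.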