Re: spark streaming with kafka source, how many concurrent jobs?

2017-03-21 Thread shyla deshpande
Thanks TD. On Tue, Mar 14, 2017 at 4:37 PM, Tathagata Das wrote: > This setting allows multiple Spark jobs generated through multiple foreachRDD to run concurrently, even if they are across batches. So output op2 from batch X can run concurrently with op1 of batch X+1. This is not safe because it breaks the checkpointing logic in subtle ways. …

Re: spark streaming with kafka source, how many concurrent jobs?

2017-03-14 Thread Tathagata Das
This setting allows multiple Spark jobs generated through multiple foreachRDD to run concurrently, even if they are across batches. So output op2 from batch X can run concurrently with op1 of batch X+1. This is not safe because it breaks the checkpointing logic in subtle ways. Note that this was …
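
A minimal sketch of the kind of application being described (not from the thread; the source, batch interval, and checkpoint path are illustrative, and the master is assumed to come from spark-submit). With two output operations and spark.streaming.concurrentJobs raised above 1, the job for op2 of batch X may overlap the job for op1 of batch X+1, which is what makes the setting unsafe with checkpointing:

import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}

object TwoOutputOpsSketch {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf()
      .setAppName("two-output-ops")
      // Unsafe per TD's warning: with 2 scheduler threads, the job for output
      // op2 of batch X may run concurrently with op1 of batch X+1.
      .set("spark.streaming.concurrentJobs", "2")

    val ssc = new StreamingContext(conf, Seconds(10))
    ssc.checkpoint("/tmp/checkpoint") // checkpointing is what the overlap can corrupt

    val lines = ssc.socketTextStream("localhost", 9999) // placeholder source

    // Output operation 1 ("op1"): one job per batch.
    lines.foreachRDD { rdd => println(s"op1: ${rdd.count()} lines") }

    // Output operation 2 ("op2"): a second job per batch.
    lines.foreachRDD { rdd => println(s"op2: ${rdd.flatMap(_.split(" ")).count()} words") }

    ssc.start()
    ssc.awaitTermination()
  }
}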

Re: spark streaming with kafka source, how many concurrent jobs?

2017-03-14 Thread shyla deshpande
Thanks TD for the response. Can you please provide more explanation? I have multiple streams in the Spark Streaming application (Spark 2.0.2, using DStreams). I know many people are using this setting, so your explanation will help a lot of people. Thanks. On Fri, Mar 10, 2017 at 6:24 PM, Tathagata Das wrote: …

Re: spark streaming with kafka source, how many concurrent jobs?

2017-03-10 Thread Tathagata Das
That config is not safe. Please do not use it. On Mar 10, 2017 10:03 AM, "shyla deshpande" wrote: > I have a Spark Streaming application which processes 3 Kafka streams and has 5 output operations. Not sure what the setting for spark.streaming.concurrentJobs should be. 1. If the concurrentJobs setting is 4, does that mean 2 output operations will be run sequentially? …

spark streaming with kafka source, how many concurrent jobs?

2017-03-10 Thread shyla deshpande
I have a Spark Streaming application which processes 3 Kafka streams and has 5 output operations. Not sure what the setting for spark.streaming.concurrentJobs should be. 1. If the concurrentJobs setting is 4, does that mean 2 output operations will be run sequentially? 2. If I had 6 cores, what …
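
A sketch of the setup described here, assuming the spark-streaming-kafka-0-10 direct stream API; the broker address, topic names, group id, and the particular output operations are placeholders. Each output operation generates one job per batch, so five output operations mean five jobs per batch, and with the default spark.streaming.concurrentJobs of 1 they run one after another:

import org.apache.kafka.common.serialization.StringDeserializer
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.streaming.kafka010.ConsumerStrategies.Subscribe
import org.apache.spark.streaming.kafka010.KafkaUtils
import org.apache.spark.streaming.kafka010.LocationStrategies.PreferConsistent

object ThreeStreamsFiveOps {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("three-kafka-streams")
    // spark.streaming.concurrentJobs is left at its default of 1,
    // so the five jobs of each batch run sequentially.

    val ssc = new StreamingContext(conf, Seconds(30))

    val kafkaParams = Map[String, Object](
      "bootstrap.servers"  -> "localhost:9092",
      "key.deserializer"   -> classOf[StringDeserializer],
      "value.deserializer" -> classOf[StringDeserializer],
      "group.id"           -> "example-group",
      "auto.offset.reset"  -> "latest")

    // Three independent Kafka streams (topic names are illustrative).
    val Seq(orders, clicks, logs) = Seq("orders", "clicks", "logs").map { topic =>
      KafkaUtils.createDirectStream[String, String](
        ssc, PreferConsistent, Subscribe[String, String](Seq(topic), kafkaParams))
    }

    // Five output operations in total => five jobs per batch.
    orders.foreachRDD(rdd => println(s"orders: ${rdd.count()}"))
    orders.foreachRDD(rdd => rdd.map(_.value()).saveAsTextFile(s"/tmp/orders-${System.currentTimeMillis()}"))
    clicks.foreachRDD(rdd => println(s"clicks: ${rdd.count()}"))
    clicks.foreachRDD(rdd => println(s"distinct click keys: ${rdd.map(_.key()).distinct().count()}"))
    logs.foreachRDD(rdd => println(s"logs: ${rdd.count()}"))

    ssc.start()
    ssc.awaitTermination()
  }
}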

Re: PySpark concurrent jobs using single SparkContext

2015-08-21 Thread Hemant Bhanawat
> … periods. One solution is instead to launch multiple Spark jobs via spark-submit and let YARN/Spark's dynamic executor allocation take care of fair scheduling. In practice, this doesn't seem to yield very fast computation, perhaps due to some additional overhead with YARN. …

PySpark concurrent jobs using single SparkContext

2015-08-20 Thread Mike Sukmanowsky
… One solution is instead to launch multiple Spark jobs via spark-submit and let YARN/Spark's dynamic executor allocation take care of fair scheduling. In practice, this doesn't seem to yield very fast computation, perhaps due to some additional overhead with YARN. Is there any safe way to launch …
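
A common pattern for running concurrent jobs against a single SparkContext is to submit actions from separate threads, optionally with the FAIR scheduler so the jobs share executors. The sketch below is in Scala (the thread itself concerns PySpark), and the pool names and job bodies are made up for illustration:

import org.apache.spark.{SparkConf, SparkContext}

import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration.Duration
import scala.concurrent.{Await, Future}

object ConcurrentJobsOneContext {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf()
      .setAppName("concurrent-jobs-one-context")
      .set("spark.scheduler.mode", "FAIR") // round-robin between concurrent jobs

    val sc = new SparkContext(conf)

    // SparkContext is thread-safe: each action submitted from its own thread
    // becomes an independent job, scheduled concurrently by the DAG scheduler.
    val jobs = (1 to 3).map { i =>
      Future {
        // Pools not defined in fairscheduler.xml fall back to default settings.
        sc.setLocalProperty("spark.scheduler.pool", s"pool-$i")
        sc.parallelize(1L to 1000000L).map(_ * i).sum()
      }
    }

    val results = Await.result(Future.sequence(jobs), Duration.Inf)
    println(results.mkString(", "))
    sc.stop()
  }
}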

Re: concurrent jobs

2014-07-18 Thread Tathagata Das
Yes, your interpretation is correct. For every batch, there will be two jobs, and they will be run one after another. TD On Fri, Jul 18, 2014 at 3:06 AM, Haopu Wang wrote: > By looking at the code of JobScheduler, I find the following parameters: private val numConcurrentJobs = ssc.conf.getInt("spark.streaming.concurrentJobs", 1) …

concurrent jobs

2014-07-18 Thread Haopu Wang
By looking at the code of JobScheduler, I find the following parameters:

private val numConcurrentJobs = ssc.conf.getInt("spark.streaming.concurrentJobs", 1)
private val jobExecutor = Executors.newFixedThreadPool(numConcurrentJobs)

Does that mean each app can have only one active stage?
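
A standalone sketch (plain JVM code, not Spark internals) of what the two quoted lines imply: the job executor is just a fixed-size thread pool, so the default pool size of 1 makes the jobs of a batch run one after another, exactly as TD's answer above describes:

import java.util.concurrent.{Executors, TimeUnit}

object JobExecutorSketch {
  def main(args: Array[String]): Unit = {
    // Mirrors JobScheduler: ssc.conf.getInt("spark.streaming.concurrentJobs", 1)
    val numConcurrentJobs = 1
    val jobExecutor = Executors.newFixedThreadPool(numConcurrentJobs)

    // Two output operations => two jobs per batch. With a pool of size 1 they
    // run strictly one after another; a larger pool would let them overlap.
    for (op <- 1 to 2) {
      jobExecutor.submit(new Runnable {
        def run(): Unit = {
          println(s"running job for output op $op")
          Thread.sleep(500) // stand-in for the real job
        }
      })
    }

    jobExecutor.shutdown()
    jobExecutor.awaitTermination(10, TimeUnit.SECONDS)
  }
}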

Re: launching concurrent jobs programmatically

2014-04-29 Thread ishaaq
… spark-shell? Our app has a number of jars that I don't particularly want to have to upload each time I want to run a small ad-hoc spark-shell session. Thanks, Ishaaq

Re: launching concurrent jobs programmatically

2014-04-28 Thread Patrick Wendell
> … using the same SparkContext or should I create a new one each time my app needs to run a job? Thanks, Ishaaq

Re: launching concurrent jobs programmatically

2014-04-28 Thread Andrew Ash
> … a job? Thanks, Ishaaq

launching concurrent jobs programmatically

2014-04-28 Thread ishaaq
one each time my app needs to run a job? Thanks, Ishaaq -- View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/launching-concurrent-jobs-programmatically-tp4990.html Sent from the Apache Spark User List mailing list archive at Nabble.com.