Re: Not able run multiple tasks in parallel, spark streaming

2015-04-22 Thread Abhay Bansal
Thanks for your suggestions. sc.set("spark.streaming.concurrentJobs","2") works, but I am not sure about using it in production. @TD: The number of streams we are interacting with is very large; managing that many separate applications would just be an overhead. Moreover, there are other operations which…

Re: Not able run multiple tasks in parallel, spark streaming

2015-04-22 Thread Tathagata Das
Furthermore, just to explain: doing arr.par.foreach does not help because it is not really running anything, it is only doing the setup of the computation. Doing the setup in parallel does not mean that the jobs will run concurrently. Also, from your code it seems like your pairs of DStreams don't interact…
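TD's point can be illustrated without Spark at all: building lazy computations in parallel is not the same as executing them in parallel. A minimal plain-Scala sketch of that distinction (the `lazyJobs` example is hypothetical, not the poster's code; on Scala 2.13+ `.par` also needs the separate scala-parallel-collections module):

```scala
object SetupVsExecution {
  def main(args: Array[String]): Unit = {
    // "Setup" phase: .par only parallelises the *construction* of these
    // lazy thunks -- analogous to wiring up DStream transformations.
    // Nothing runs here.
    val lazyJobs: List[() => Int] =
      (1 to 4).par.map { i =>
        () => { Thread.sleep(100); i * i }
      }.toList

    // "Execution" phase: the thunks actually run here, one after another,
    // just as Spark Streaming runs one output-operation job at a time
    // unless spark.streaming.concurrentJobs is raised above 1.
    val results = lazyJobs.map(f => f())
    println(results) // List(1, 4, 9, 16)
  }
}
```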

Re: Not able run multiple tasks in parallel, spark streaming

2015-04-21 Thread Akhil Das
You can enable this flag to run multiple jobs concurrently. It might not be production-ready, but you can give it a try: sc.set("spark.streaming.concurrentJobs","2"). Refer to TD's answer here
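For context, this key is normally set on the SparkConf before the StreamingContext is created, rather than on a live context. A minimal sketch (the app name and batch interval are placeholders; `spark.streaming.concurrentJobs` is an undocumented, experimental setting, hence the production caveat above):

```scala
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}

// Allow up to 2 streaming jobs to run concurrently instead of the
// default of 1. Experimental: jobs from different batches may then
// execute out of order relative to each other.
val conf = new SparkConf()
  .setAppName("ConcurrentJobsExample")
  .set("spark.streaming.concurrentJobs", "2")

val ssc = new StreamingContext(conf, Seconds(10))
```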

Not able run multiple tasks in parallel, spark streaming

2015-04-21 Thread Abhay Bansal
Hi, I have a use case wherein I have to join multiple Kafka topics in parallel. So if there are 2n topics, there is a one-to-one mapping of topic pairs which need to be joined. val arr= ... for(condition) { val dStream1 = KafkaUtils.createDirectStream[String, String, StringDecoder, St…
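The original code is truncated, so the following is only a reconstruction of the pairwise-join pattern described above, using the Kafka 0.8 direct-stream API of that era. The topic names (`in-$i`/`out-$i`), helper name, and join logic are assumptions, not the poster's actual code:

```scala
import kafka.serializer.StringDecoder
import org.apache.spark.streaming.StreamingContext
import org.apache.spark.streaming.kafka.KafkaUtils

// Sketch: for each of n pairs of topics, create two direct Kafka
// streams and join them. Each join's output operation becomes its own
// streaming job, which is why the thread discusses running jobs
// concurrently.
def pairwiseJoins(ssc: StreamingContext,
                  kafkaParams: Map[String, String],
                  nPairs: Int): Unit = {
  (0 until nPairs).foreach { i =>
    val left = KafkaUtils.createDirectStream[String, String, StringDecoder, StringDecoder](
      ssc, kafkaParams, Set(s"in-$i"))
    val right = KafkaUtils.createDirectStream[String, String, StringDecoder, StringDecoder](
      ssc, kafkaParams, Set(s"out-$i"))
    // Each pair is joined independently of the other pairs.
    left.join(right).print()
  }
}
```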