One more detail: even when forcing partitions (via /repartition/), Spark still
concentrates some tasks; if I increase the load on the system (by raising
/spark.streaming.receiver.maxRate/), then even though all workers are used, the one
hosting the receiver gets twice as many tasks as the other workers.
Total del
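
For illustration, a minimal sketch of the receiver-rate knob mentioned above; the app name and the rate value are assumptions, not the original job's settings:

    import org.apache.spark.SparkConf

    // Sketch only: cap how many records/sec the single receiver ingests.
    // Raising this value is what increases the load and exposes the
    // imbalance described above (the receiver's worker gets more tasks).
    val conf = new SparkConf()
      .setAppName("streaming-receiver-skew")            // assumed name
      .set("spark.streaming.receiver.maxRate", "10000") // assumed records/sec
    // The input DStream would then be spread with /repartition/ before
    // any processing, as described in the message above.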
Thanks, Akhil Das-2: I actually tried setting spark.default.parallelism, but it
had no effect :-/
I am running in standalone mode and performing a mix of map/filter/foreachRDD.
I had to force parallelism with repartition to get both workers to process
tasks, but I do not think this should be required (and I am n
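
For context, a rough sketch of the kind of job being described; the source, batch interval, and the partition count of 16 are assumptions, not the original code:

    import org.apache.spark.SparkConf
    import org.apache.spark.streaming.{Seconds, StreamingContext}

    val conf = new SparkConf()
      .setAppName("standalone-streaming")        // assumed name
      .set("spark.default.parallelism", "16")    // tried, but no effect here

    val ssc = new StreamingContext(conf, Seconds(1))
    val lines = ssc.socketTextStream("localhost", 9999) // single receiver (assumed source)

    lines
      .repartition(16)        // forcing parallelism so both workers get tasks
      .map(_.toUpperCase)     // placeholder map
      .filter(_.nonEmpty)     // placeholder filter
      .foreachRDD { rdd =>
        println(s"partitions: ${rdd.partitions.length}, records: ${rdd.count()}")
      }

    ssc.start()
    ssc.awaitTermination()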
What operation are you performing? And what is your cluster configuration?
If you are doing an operation like groupBy, reduceBy, join, etc., then you could
try providing the level of parallelism explicitly. If you give 16, then each of
your workers will most likely get 8 tasks to execute.
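
A minimal sketch of what providing the level of parallelism could look like; the data and the value 16 are assumptions:

    import org.apache.spark.{SparkConf, SparkContext}

    val sc = new SparkContext(new SparkConf().setAppName("parallelism-example"))

    val pairs = sc.parallelize(Seq(("a", 1), ("b", 2), ("a", 3)))
    // Pass the number of partitions to the shuffle operation directly.
    // With 2 workers and 16 partitions, each worker gets roughly 8 reduce tasks.
    val counts = pairs.reduceByKey(_ + _, 16)

    println(counts.partitions.length) // 16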
Thanks
Best Regards