Hi,

Any comments please?

Regards,
Laeeq
On Friday, April 17, 2015 11:37 AM, Laeeq Ahmed <laeeqsp...@yahoo.com.INVALID> wrote:

Hi,

I am working with multiple Kafka streams (23 streams) and currently I process them separately, receiving one stream from each topic. I have the following questions.

1. The Spark Streaming guide suggests unioning these streams. Is it possible to get statistics for each individual stream even after they are unioned? (A small tagging sketch follows below.)

2. My calculations are not complex. I use a 2-second batch interval, and with 2 streams a single core easily processes them in under 2 seconds. There is some shuffling involved in my application. As I increase the number of streams, and the number of executors accordingly, the application's scheduling delay grows and becomes unmanageable within 2 seconds. I believe this happens because with that many streams the number of tasks increases, so the shuffling magnifies, and all streams share the same executors. Is it possible to dedicate part of the executors to a particular stream while still processing the streams simultaneously? E.g. if I have 15 cores on the cluster and 5 streams, 5 cores will be taken by the 5 receivers; of the remaining 10, can I give 2 cores to each of the 5 streams?

Just to add, increasing the batch interval does help, but I don't want to increase the batch size due to application restrictions and delayed results (the blockInterval and defaultParallelism settings help only to a limited extent).

Please see the attached file for the code snippet.

Regards,
Laeeq
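To make question 1 concrete, here is a minimal sketch (not the attached code) of one way to keep per-stream statistics after a union: tag each record with its source topic before unioning, then aggregate by that key. It assumes the receiver-based KafkaUtils.createStream API; the ZooKeeper quorum, consumer group, and topic names are placeholders.

```scala
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.streaming.kafka.KafkaUtils

object TaggedUnion {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("TaggedUnion")
    // (spark.streaming.blockInterval and spark.default.parallelism can also be
    //  tuned on this SparkConf, as noted in the mail.)
    val ssc = new StreamingContext(conf, Seconds(2)) // 2 s batch interval, as in the mail

    val zkQuorum = "zk-host:2181"                    // placeholder
    val groupId  = "my-group"                        // placeholder
    val topics   = (1 to 23).map(i => s"topic$i")    // placeholder topic names

    // One receiver per topic; tag every record with the topic it came from
    // so per-stream statistics survive the union.
    val tagged = topics.map { t =>
      KafkaUtils.createStream(ssc, zkQuorum, groupId, Map(t -> 1))
        .map { case (_, value) => (t, value) }
    }

    val unioned = ssc.union(tagged)

    // Example per-stream statistic: record count per topic per batch.
    unioned
      .map { case (topic, _) => (topic, 1L) }
      .reduceByKey(_ + _)
      .print()

    ssc.start()
    ssc.awaitTermination()
  }
}
```

The same tagging idea works for any per-topic aggregate (counts, means, error rates): the union only merges the batches into one DStream, so keyed operations can still tell the streams apart.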