Hi Mayur,

Thanks for your response. I wrote a simple test that sets up a DStream with 5 batches. The batch duration is 1 second, and the 3rd batch takes an extra 2 seconds to process. The output of the test shows that the 3rd batch causes a backlog, and that Spark Streaming does catch up on the 4th and 5th batches. (DStream.print was modified to also output the system time.)
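To make the timing easier to reason about, here is a rough model of the test in plain Python (not Spark code). It assumes batches are processed strictly one at a time and in order, and that each normal batch costs about 30 ms; both the `simulate_batches` helper and the 30 ms figure are illustrative assumptions, not something taken from Spark itself.

```python
def simulate_batches(batch_times_ms, processing_ms):
    """Return (batch_time, finish_time, scheduling_delay) for each batch.

    Assumption: a batch starts only once its batch time has arrived AND the
    previous batch has finished (sequential, in-order processing).
    """
    results = []
    prev_finish = 0
    for batch_time, cost in zip(batch_times_ms, processing_ms):
        start = max(batch_time, prev_finish)   # wait for the backlog to clear
        finish = start + cost
        results.append((batch_time, finish, start - batch_time))
        prev_finish = finish
    return results


# Five 1-second batches, with times relative to the first batch.
batch_times = [0, 1000, 2000, 3000, 4000]
# Assumed costs: ~30 ms per batch, plus the extra 2000 ms on the 3rd batch.
costs = [30, 30, 2030, 30, 30]

for batch_time, finish, delay in simulate_batches(batch_times, costs):
    print(f"batch {batch_time} ms: finished at {finish} ms, "
          f"scheduling delay {delay} ms")
```

Under these assumptions the 4th batch starts about 1 second late and the 5th only about 60 ms late, so the backlog is gone within two batches. The system times in the real output below show the same pattern.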
-------------------------------------------
Time: 1409959708000 ms, system time: 1409959708269
-------------------------------------------
1155

-------------------------------------------
Time: 1409959709000 ms, system time: 1409959709033
-------------------------------------------
2255

delay 2000 ms
-------------------------------------------
Time: 1409959710000 ms, system time: 1409959712036
-------------------------------------------
3355

-------------------------------------------
Time: 1409959711000 ms, system time: 1409959712059
-------------------------------------------
4455

-------------------------------------------
Time: 1409959712000 ms, system time: 1409959712083
-------------------------------------------
5555

Thanks!

--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/how-to-choose-right-DStream-batch-interval-tp13578p13855.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.