Spark Streaming by default will not start processing a batch until the
current batch has finished.  So if your processing time is consistently
larger than your batch interval, scheduling delays will build up.
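That backlog can be sketched with a small simulation (plain Python, not
Spark code; the interval and processing-time values are hypothetical):
with a fixed batch interval and a per-batch processing time, the gap
between a batch's batchTime and its submissionTime grows by
(processing time - batch interval) on every batch once processing is
slower than the interval.

```python
def scheduling_delays(batch_interval_ms, processing_ms, n_batches):
    """Simulate sequential batch scheduling: batch i's data is ready at
    (i+1) * batch_interval_ms, but it is only submitted once the previous
    batch has finished processing. Returns the per-batch delay in ms."""
    delays = []
    free_at = 0  # time when the scheduler is free to take the next batch
    for i in range(n_batches):
        batch_time = (i + 1) * batch_interval_ms   # analogous to batchTime
        start = max(batch_time, free_at)           # analogous to submissionTime
        delays.append(start - batch_time)          # submissionTime - batchTime
        free_at = start + processing_ms
    return delays

# 300 ms interval, 500 ms processing: delay grows 200 ms per batch
print(scheduling_delays(300, 500, 5))   # → [0, 200, 400, 600, 800]
# 300 ms interval, 200 ms processing: scheduler keeps up, no delay
print(scheduling_delays(300, 200, 5))   # → [0, 0, 0, 0, 0]
```

With the first set of numbers each batch waits 200 ms longer than the
previous one, which is why even 300 ms batches can show delays measured
in minutes after the job has been running for a while.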

On Wed, Mar 9, 2016 at 11:09 AM, Sachin Aggarwal
<different.sac...@gmail.com> wrote:
> Hi All,
>
> we have batchTime and submissionTime.
>
> @param batchTime   Time of the batch
>
> @param submissionTime  Clock time of when jobs of this batch was submitted
> to the streaming scheduler queue
>
> 1) We are seeing a difference between batchTime and submissionTime that
> can reach minutes, even for small batches (300ms) with direct Kafka. We
> see this only when the processing time is more than the batch interval.
> How can we explain this delay?
>
> 2) In a case where the batch processing time is more than the batch
> interval, will Spark fetch the next batch's data from Kafka in parallel
> with processing the current batch, or will it wait for the current batch
> to finish first?
>
> I would be thankful if you could give me some pointers.
>
> Thanks!
> --
>
> Thanks & Regards
>
> Sachin Aggarwal
> 7760502772
