Where are we capturing this delay? I am aware of the scheduling delay, which is defined as processing start time minus submission time, not as an offset from the batch creation time.

On Wed, Mar 9, 2016 at 10:46 PM, Cody Koeninger <c...@koeninger.org> wrote:

> Spark streaming by default will not start processing a batch until the
> current batch is finished. So if your processing time is larger than
> your batch time, delays will build up.
>
> On Wed, Mar 9, 2016 at 11:09 AM, Sachin Aggarwal
> <different.sac...@gmail.com> wrote:
> > Hi All,
> >
> > We have batchTime and submissionTime.
> >
> > @param batchTime Time of the batch
> >
> > @param submissionTime Clock time of when jobs of this batch was submitted
> > to the streaming scheduler queue
> >
> > 1) With direct Kafka we are seeing a difference between batchTime and
> > submissionTime, even of minutes, for small batches (300 ms). We see this
> > only when the processing time is more than the batch interval. How can
> > we explain this delay?
> >
> > 2) When batch processing time is more than the batch interval, will
> > Spark fetch the next batch's data from Kafka in parallel with processing
> > the current batch, or will it wait for the current batch to finish first?
> >
> > I would be thankful if you could give me some pointers.
> >
> > Thanks!
> > --
> >
> > Thanks & Regards
> >
> > Sachin Aggarwal
> > 7760502772

--
Thanks & Regards
Sachin Aggarwal
7760502772
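
As a rough illustration of the point quoted above (a batch does not start until the previous one has finished), the toy model below shows how the wait grows once processing time exceeds the batch interval. It is a simplified sketch in plain Python, not Spark's scheduler. The names simulate, BatchTiming and wait_ms are made up for illustration, and it assumes a constant processing time and strictly one batch at a time. The 300 ms interval comes from the thread; the 500 ms processing time is an assumed value. It models only the backlog and makes no claim about which Spark timestamp ends up recording the wait.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class BatchTiming:
    batch_time_ms: int  # nominal time of the batch (end of its interval)
    start_ms: int       # when processing of this batch begins in the model
    end_ms: int         # when processing of this batch ends in the model

    @property
    def wait_ms(self) -> int:
        # Time the batch spends waiting behind earlier batches.
        return self.start_ms - self.batch_time_ms


def simulate(batch_interval_ms: int, processing_ms: int, num_batches: int) -> List[BatchTiming]:
    """Toy model: one batch per interval, processed strictly one at a time."""
    timings = []
    free_at_ms = 0  # when the (single) processing slot becomes free
    for i in range(1, num_batches + 1):
        batch_time = i * batch_interval_ms
        # A batch cannot start before it exists, nor before the previous one ends.
        start = max(batch_time, free_at_ms)
        end = start + processing_ms
        timings.append(BatchTiming(batch_time, start, end))
        free_at_ms = end
    return timings


if __name__ == "__main__":
    # 300 ms interval (from the thread), 500 ms processing time (assumed).
    for t in simulate(batch_interval_ms=300, processing_ms=500, num_batches=5):
        print(t.batch_time_ms, t.start_ms, t.wait_ms)
```

In this model each batch waits 200 ms longer than the one before it (500 - 300), so the lag keeps growing for as long as processing stays slower than the batch interval. If processing is faster than the interval, the wait stays at zero.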