Re: spark streaming 1.3 with kafka connection timeout

2015-09-10 Thread Cody Koeninger
Again, that looks like you lost a kafka broker. Executors will retry failed tasks automatically up to the max failures. spark.streaming.kafka.maxRetries controls the number of times the driver will retry when attempting to get offsets. If your broker isn't up / rebalance hasn't finished after N

Re: spark streaming 1.3 with kafka connection timeout

2015-09-10 Thread Shushant Arora
My bad Got that exception in driver code of same job not in executor. But it says of socket close exception only. org.apache.spark.SparkException: ArrayBuffer(java.io.EOFException: Received -1 when reading from channel, socket has likely been closed., org.apache.spark.SparkException: Couldn't fi

Re: spark streaming 1.3 with kafka connection timeout

2015-09-10 Thread Cody Koeninger
NotLeaderForPartitionException means you lost a kafka broker or had a rebalance... why did you say " I am getting Connection tmeout in my code." You've asked questions about this exact same situation before, the answer remains the same On Thu, Sep 10, 2015 at 9:44 AM, Shushant Arora wrote: > St

Re: spark streaming 1.3 with kafka connection timeout

2015-09-10 Thread Shushant Arora
Stack trace is 15/09/09 22:49:52 ERROR kafka.KafkaRDD: Lost leader for topic topicname partition 99, sleeping for 200ms kafka.common.NotLeaderForPartitionException at sun.reflect.GeneratedConstructorAccessor26.newInstance(Unknown Source) at sun.reflect.DelegatingConstructorAccessor

Re: spark streaming 1.3 with kafka connection timeout

2015-09-10 Thread Cody Koeninger
Post the actual stacktrace you're getting On Thu, Sep 10, 2015 at 12:21 AM, Shushant Arora wrote: > Executors in spark streaming 1.3 fetch messages from kafka in batches and > what happens when executor takes longer time to complete a fetch batch > > say in > > > directKafkaStream.foreachRDD(new

spark streaming 1.3 with kafka connection timeout

2015-09-09 Thread Shushant Arora
Executors in spark streaming 1.3 fetch messages from kafka in batches and what happens when executor takes longer time to complete a fetch batch say in directKafkaStream.foreachRDD(new Function, Void>() { @Override public Void call(JavaRDD v1) throws Exception { v1.foreachPartition(new VoidFun

Re: spark streaming 1.3 with kafka

2015-09-01 Thread Shushant Arora
>>>> v1.foreachPartition(new VoidFunction>{ >>>>>> @Override >>>>>> public void call(Iterator t) throws Exception { >>>>>> }});}}); >>>>>> >>>>>> change directKafkaStream's RDD's offse

Re: spark streaming 1.3 with kafka

2015-09-01 Thread Cody Koeninger
directKafkaStream's RDD's offset range.(fromOffset). >>>>> >>>>> I can't do this in compute method since compute would have been called >>>>> at current batch queue time - but condition is set at previous batch run >>>>> ti

Re: spark streaming 1.3 with kafka

2015-09-01 Thread Shushant Arora
ue time - but condition is set at previous batch run >>>> time. >>>> >>>> >>>> On Tue, Sep 1, 2015 at 7:09 PM, Cody Koeninger >>>> wrote: >>>> >>>>> It's at the time compute() gets called, which should be ne

Re: spark streaming 1.3 with kafka

2015-09-01 Thread Cody Koeninger
ompute method since compute would have been called >>> at current batch queue time - but condition is set at previous batch run >>> time. >>> >>> >>> On Tue, Sep 1, 2015 at 7:09 PM, Cody Koeninger >>> wrote: >>> >>>> It'

Re: spark streaming 1.3 with kafka

2015-09-01 Thread Cody Koeninger
on is set at previous batch run time. > > > On Tue, Sep 1, 2015 at 7:09 PM, Cody Koeninger wrote: > >> It's at the time compute() gets called, which should be near the time the >> batch should have been queued. >> >> On Tue, Sep 1, 2015 at 8:02 AM, Shushant Ar

Re: spark streaming 1.3 with kafka

2015-09-01 Thread Shushant Arora
nt batch queue time - but condition is set at previous batch run time. >> >> >> On Tue, Sep 1, 2015 at 7:09 PM, Cody Koeninger >> wrote: >> >>> It's at the time compute() gets called, which should be near the time >>> the batch should have been queue

Re: spark streaming 1.3 with kafka

2015-09-01 Thread Shushant Arora
09 PM, Cody Koeninger wrote: > It's at the time compute() gets called, which should be near the time the > batch should have been queued. > > On Tue, Sep 1, 2015 at 8:02 AM, Shushant Arora > wrote: > >> Hi >> >> In spark streaming 1.3 with kafka- when does

Re: spark streaming 1.3 with kafka

2015-09-01 Thread Cody Koeninger
It's at the time compute() gets called, which should be near the time the batch should have been queued. On Tue, Sep 1, 2015 at 8:02 AM, Shushant Arora wrote: > Hi > > In spark streaming 1.3 with kafka- when does driver bring latest offsets > of this run - at start of each batc

spark streaming 1.3 with kafka

2015-09-01 Thread Shushant Arora
Hi In spark streaming 1.3 with kafka- when does driver bring latest offsets of this run - at start of each batch or at time when batch gets queued ? Say few of my batches take longer time to complete than their batch interval. So some of batches will go in queue. Will driver waits for queued