Re: Kafka Offsets after application is restarted using Spark Streaming Checkpointing

2015-11-15 Thread kundan kumar
Sure Thanks !! On Sun, Nov 15, 2015 at 9:13 PM, Cody Koeninger wrote: > Not sure on that, maybe someone else can chime in > > On Sat, Nov 14, 2015 at 4:51 AM, kundan kumar > wrote: > >> Hi Cody , >> >> Thanks for the clarification. I will try to come up with some workaround. >> >> I have an an

Re: Kafka Offsets after application is restarted using Spark Streaming Checkpointing

2015-11-15 Thread Cody Koeninger
Not sure on that, maybe someone else can chime in On Sat, Nov 14, 2015 at 4:51 AM, kundan kumar wrote: > Hi Cody , > > Thanks for the clarification. I will try to come up with some workaround. > > I have an another doubt. When my job is restarted, and recovers from the > checkpoint it does the r

Re: Kafka Offsets after application is restarted using Spark Streaming Checkpointing

2015-11-14 Thread kundan kumar
Hi Cody , Thanks for the clarification. I will try to come up with some workaround. I have an another doubt. When my job is restarted, and recovers from the checkpoint it does the re-partitioning step twice for each 15 minute job until the window of 2 hours is complete. Then the re-partitioning

Re: Kafka Offsets after application is restarted using Spark Streaming Checkpointing

2015-11-13 Thread Cody Koeninger
Unless you change maxRatePerPartition, a batch is going to contain all of the offsets from the last known processed to the highest available. Offsets are not time-based, and Kafka's time-based api currently has very poor granularity (it's based on filesystem timestamp of the log segment). There's