Can I reset the offset range based on some condition, before calling transformations on the stream?
Say, before calling:

    directKafkaStream.foreachRDD(new Function<JavaRDD<byte[][]>, Void>() {
      @Override
      public Void call(JavaRDD<byte[][]> v1) throws Exception {
        v1.foreachPartition(new VoidFunction<Iterator<byte[][]>>() {
          @Override
          public void call(Iterator<byte[][]> t) throws Exception {
          }
        });
        return null;
      }
    });

can I change the offset range (fromOffset) of directKafkaStream's RDD? I can't do this in the compute method, since compute would have been called at the current batch's queue time, but the condition is set at the previous batch's run time.

On Tue, Sep 1, 2015 at 7:09 PM, Cody Koeninger <c...@koeninger.org> wrote:

> It's at the time compute() gets called, which should be near the time the
> batch should have been queued.
>
> On Tue, Sep 1, 2015 at 8:02 AM, Shushant Arora <shushantaror...@gmail.com>
> wrote:
>
>> Hi
>>
>> In Spark Streaming 1.3 with Kafka, when does the driver fetch the latest
>> offsets for a run: at the start of each batch, or at the time the batch
>> gets queued?
>>
>> Say a few of my batches take longer to complete than their batch
>> interval, so some batches will go into the queue. Does the driver wait for
>> queued batches to get started, or does it fetch the latest offsets before
>> they have even actually started? And when they start running, will they
>> work on the old offsets fetched at the time they were queued?
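The bookkeeping being described (remember the range a batch consumed, and reset the next batch's fromOffsets when a condition from the previous run says so) can be sketched outside Spark. This is a minimal, framework-free illustration; `OffsetTracker`, `beginBatch`, and `endBatch` are hypothetical names, not part of the Spark or Kafka API:

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper: tracks per-partition Kafka offsets across batches,
// so the next batch's fromOffsets can be reset based on a condition
// observed while the previous batch ran (e.g. a downstream failure).
class OffsetTracker {
    // partition id -> offset at which the next batch should start
    private final Map<Integer, Long> fromOffsets = new HashMap<>();
    // snapshot of the fromOffsets used by the batch currently running
    private final Map<Integer, Long> inFlight = new HashMap<>();

    OffsetTracker(Map<Integer, Long> startingOffsets) {
        fromOffsets.putAll(startingOffsets);
    }

    // Called when a batch is queued: snapshot the range it will consume
    // and optimistically advance, the way the direct stream's compute()
    // advances to the latest offsets at queue time.
    Map<Integer, Long> beginBatch(Map<Integer, Long> untilOffsets) {
        inFlight.clear();
        inFlight.putAll(fromOffsets);
        fromOffsets.putAll(untilOffsets);
        return new HashMap<>(inFlight);
    }

    // Called when the batch finishes. If the condition says "replay",
    // reset fromOffsets to the snapshot taken before the batch ran.
    void endBatch(boolean replay) {
        if (replay) {
            fromOffsets.clear();
            fromOffsets.putAll(inFlight);
        }
    }

    Map<Integer, Long> currentFromOffsets() {
        return new HashMap<>(fromOffsets);
    }
}
```

In Spark 1.3 itself, the supported hooks are passing an explicit fromOffsets map to the `KafkaUtils.createDirectStream` overload that accepts one, and reading each batch's consumed range inside `foreachRDD` by casting the RDD to `HasOffsetRanges`; there is no public API to mutate an already-queued batch's range.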