You'll need to keep track of the offsets.
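For example, a minimal sketch of that pattern in Spark 1.x Scala (the broker address, topic, partition, offset values, and output path below are placeholders, and the offset store is whatever you choose, e.g. Cassandra):

import kafka.serializer.StringDecoder
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.streaming.kafka.{KafkaUtils, OffsetRange}

object DailyKafkaExtract {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("daily-kafka-extract"))
    val kafkaParams = Map("metadata.broker.list" -> "broker1:9092")

    // Offsets saved by the previous run -- in practice read these from
    // your own store (Cassandra, ZooKeeper, a file, ...), not constants.
    val fromOffset = 0L        // last offset processed by the previous run
    val untilOffset = 100000L  // latest offset you want this run to read up to

    // One OffsetRange per topic-partition you want to read.
    val offsetRanges = Array(
      OffsetRange("mytopic", 0, fromOffset, untilOffset)
    )

    // Plain batch read from Kafka; no StreamingContext involved.
    val rdd = KafkaUtils.createRDD[String, String, StringDecoder, StringDecoder](
      sc, kafkaParams, offsetRanges)

    rdd.map(_._2).saveAsTextFile("/data/kafka/extract")  // or write to Cassandra

    // After a successful write, persist untilOffset so the next run starts
    // where this one stopped -- that is the "keep track of offsets" step.
    sc.stop()
  }
}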
On Fri, Dec 18, 2015 at 9:51 AM, sri hari kali charan Tummala <
kali.tumm...@gmail.com> wrote:

> Hi Cody,
>
> KafkaUtils.createRDD totally makes sense now. I can run my Spark job once
> every 15 minutes, extract the data out of Kafka, and stop. Am I right
> that I can rely on the Kafka offsets for incremental data, so that no
> duplicate data will be returned?
>
> Thanks
> Sri
>
> On Fri, Dec 18, 2015 at 2:41 PM, Cody Koeninger <c...@koeninger.org>
> wrote:
>
>> If you're really doing a daily batch job, have you considered just using
>> KafkaUtils.createRDD rather than a streaming job?
>>
>> On Fri, Dec 18, 2015 at 5:04 AM, kali.tumm...@gmail.com <
>> kali.tumm...@gmail.com> wrote:
>>
>>> Hi All,
>>>
>>> Imagine I have production Spark Streaming Kafka (direct connection)
>>> publisher and subscriber jobs running, which publish data to and receive
>>> data from a Kafka topic, and I save one day's worth of data using
>>> dstream.slice to a daily Cassandra table (I create the daily table
>>> before running the Spark Streaming job).
>>>
>>> My question: if all of the above runs under a scheduler such as autosys,
>>> how do I tell the publisher to stop publishing at end of day, and the
>>> subscriber to stop receiving, without killing the jobs? If I kill them,
>>> the autosys scheduler turns red saying the job failed.
>>> Is there a way to stop both the subscriber and the publisher without
>>> killing or terminating the code?
>>>
>>> Thanks
>>>
>>
>
> --
> Thanks & Regards
> Sri Tummala
>
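For the graceful-stop part of the original question, one common pattern (not spelled out in this thread) is to have the scheduler drop a marker file instead of killing the process, and have the driver poll for that file and call StreamingContext.stop with stopGracefully = true, so in-flight batches finish and the job exits with success. A minimal sketch, assuming a running StreamingContext `ssc` and a hypothetical HDFS marker path:

import org.apache.hadoop.fs.{FileSystem, Path}
import org.apache.spark.streaming.StreamingContext

// Hypothetical marker path: the scheduler "stops" the job by creating
// this file rather than sending a kill signal.
val stopMarker = new Path("/tmp/stop_streaming_job")

def awaitShutdownMarker(ssc: StreamingContext, checkIntervalMs: Long = 10000L): Unit = {
  val fs = FileSystem.get(ssc.sparkContext.hadoopConfiguration)
  var stopped = false
  while (!stopped) {
    // Returns true once the streaming context has terminated.
    stopped = ssc.awaitTerminationOrTimeout(checkIntervalMs)
    if (!stopped && fs.exists(stopMarker)) {
      // Finish in-flight batches, then exit cleanly so autosys sees success.
      ssc.stop(stopSparkContext = true, stopGracefully = true)
      stopped = true
    }
  }
}

// In the driver, call awaitShutdownMarker(ssc) instead of ssc.awaitTermination().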