Yes, updateStateByKey worked, though there were some further complications. On Oct 30, 2015 8:27 AM, "skaarthik oss" <skaarthik....@gmail.com> wrote:
> Did you consider the UpdateStateByKey operation?
>
> *From:* Sandeep Giri [mailto:sand...@knowbigdata.com]
> *Sent:* Thursday, October 29, 2015 3:09 PM
> *To:* user <u...@spark.apache.org>; dev <dev@spark.apache.org>
> *Subject:* Maintaining overall cumulative data in Spark Streaming
>
> Dear All,
>
> If a continuous stream of text is coming in and you have to keep
> publishing the overall word count so far since 0:00 today, what would
> you do?
>
> Publishing the results for a window is easy, but if we have to keep
> aggregating the results, how do we go about it?
>
> I have tried keeping a StreamRDD with the aggregated count and doing a
> fullOuterJoin on each batch, but it didn't work. It seems the StreamRDD
> gets reset.
>
> Kindly help.
>
> Regards,
> Sandeep Giri
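For anyone landing on this thread later: a minimal sketch of the suggested approach, using Spark Streaming's `updateStateByKey` to carry a running word count across batches. The source, port, batch interval, and checkpoint path below are illustrative assumptions, not details from the thread; the key points are that the state DStream survives across batches (unlike the fullOuterJoin-on-a-saved-RDD attempt) and that checkpointing must be enabled for stateful operations.

```scala
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}

object CumulativeWordCount {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("CumulativeWordCount")
    val ssc = new StreamingContext(conf, Seconds(10))
    // Checkpointing is required for stateful transformations;
    // the path here is a placeholder.
    ssc.checkpoint("/tmp/streaming-checkpoint")

    // Hypothetical text source; substitute your actual input DStream.
    val lines = ssc.socketTextStream("localhost", 9999)
    val pairs = lines.flatMap(_.split("\\s+")).map(word => (word, 1))

    // Merge each batch's new counts into the running total per key.
    // Returning None would drop the key from the state.
    val updateTotals = (newCounts: Seq[Int], running: Option[Int]) =>
      Some(newCounts.sum + running.getOrElse(0))

    val totals = pairs.updateStateByKey[Int](updateTotals)
    totals.print() // publishes the cumulative counts every batch

    ssc.start()
    ssc.awaitTermination()
  }
}
```

Resetting the state at 0:00 each day is not handled above; one option is to have the update function return `None` (dropping the key) when a new day starts, but that part is left out of this sketch.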