Thank you, TD!

Fang, Yan
yanfang...@gmail.com
+1 (206) 849-4108

On Wed, Jul 16, 2014 at 6:53 PM, Tathagata Das <tathagata.das1...@gmail.com> wrote:

> After every checkpointing interval, the latest state RDD is stored to HDFS
> in its entirety. Along with that, the series of DStream transformations
> that was set up with the streaming context is also stored in HDFS (the
> whole DAG of DStream objects is serialized and saved).
>
> TD
>
> On Wed, Jul 16, 2014 at 5:38 PM, Yan Fang <yanfang...@gmail.com> wrote:
>
> > Hi guys,
> >
> > I am wondering how RDD checkpointing
> > <https://spark.apache.org/docs/latest/streaming-programming-guide.html#RDD Checkpointing>
> > works in Spark Streaming. When I use updateStateByKey, does Spark store
> > the entire state (at one point in time) in HDFS, or does it only put the
> > transformation in HDFS? Thank you.
> >
> > Best,
> >
> > Fang, Yan
> > yanfang...@gmail.com
> > +1 (206) 849-4108
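
For anyone reading this thread later, the behavior TD describes (the whole state written out at each checkpoint interval, rather than only a delta or only the lineage) can be sketched with a small toy model. This is plain Python and not Spark code: run_batches, latest_checkpoint, the checkpoint_every parameter, and the JSON files in a temporary directory are all hypothetical stand-ins for the streaming job, the recovery path, the checkpoint interval, and HDFS. Only the shape of the update function is meant to resemble what updateStateByKey takes. The toy does not model the second thing TD mentions, the serialized DAG of DStream objects.

```python
import json
import os
import tempfile
from collections import defaultdict


def update_count(new_values, old_count):
    """Fold one batch's values for a key into that key's previous state."""
    return (old_count or 0) + sum(new_values)


def run_batches(batches, checkpoint_dir, checkpoint_every=2):
    """Apply update_count batch by batch and snapshot the full state periodically."""
    state = {}
    for batch_no, batch in enumerate(batches, start=1):
        # Group this batch's values by key.
        grouped = defaultdict(list)
        for key, value in batch:
            grouped[key].append(value)
        # Simplification: keys absent from a batch are left unchanged here.
        for key, values in grouped.items():
            state[key] = update_count(values, state.get(key))
        if batch_no % checkpoint_every == 0:
            path = os.path.join(checkpoint_dir, f"state-{batch_no}.json")
            with open(path, "w") as f:
                json.dump(state, f)  # the entire state, not a delta
    return state


def latest_checkpoint(checkpoint_dir):
    """Load the most recent full snapshot, or an empty state if there is none."""
    files = [name for name in os.listdir(checkpoint_dir) if name.startswith("state-")]
    if not files:
        return {}
    newest = max(files, key=lambda name: int(name[len("state-"):-len(".json")]))
    with open(os.path.join(checkpoint_dir, newest)) as f:
        return json.load(f)


# Three batches of (key, value) pairs; a checkpoint is taken after batch 2.
batches = [
    [("a", 1), ("b", 1)],
    [("a", 1)],
    [("b", 1), ("c", 1)],
]
checkpoint_dir = tempfile.mkdtemp()
final_state = run_batches(batches, checkpoint_dir)
recovered = latest_checkpoint(checkpoint_dir)
print(final_state)
print(recovered)
```

The recovered snapshot holds every key that existed as of batch 2, including "b", which batch 2 itself never touched. That is the "in its entirety" part. Batch 3 is missing from the snapshot because no checkpoint had been taken after it yet.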