- Checkpointing alone isn't enough to get exactly-once semantics: events will be replayed after a failure, so your output operations must be idempotent.
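  A minimal sketch of what an idempotent sink can look like, assuming the
  stream carries (eventId, payload) pairs; the JDBC URL and the events
  table (with event_id as its primary key) are placeholders, and any store
  with an atomic upsert works the same way:

    import java.sql.DriverManager
    import org.apache.spark.streaming.dstream.DStream

    // Replays of a batch overwrite the same rows instead of duplicating
    // them, because each record carries a unique key.
    def writeIdempotently(stream: DStream[(String, String)]): Unit = {
      stream.foreachRDD { rdd =>
        rdd.foreachPartition { partition =>
          val conn = DriverManager.getConnection(
            "jdbc:postgresql://db:5432/app") // placeholder URL
          val stmt = conn.prepareStatement(
            "INSERT INTO events (event_id, payload) VALUES (?, ?) " +
            "ON CONFLICT (event_id) DO UPDATE SET payload = EXCLUDED.payload")
          partition.foreach { case (eventId, payload) =>
            stmt.setString(1, eventId)
            stmt.setString(2, payload)
            stmt.executeUpdate()
          }
          stmt.close()
          conn.close()
        }
      }
    }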
- Another way to handle upgrades is to start a second app with the new code, then stop the old one once it has caught up.

On Tue, Apr 12, 2016 at 1:15 AM, Soumitra Siddharth Johri
<soumitra.siddha...@gmail.com> wrote:

> I think before doing a code update you would want to gracefully shut down
> your streaming job and checkpoint the processed offsets (and any state
> that you maintain) in a database or HDFS.
> When you start the job up, it should read this checkpoint file, build the
> necessary state, and begin processing from the last offset processed.
>
> Another approach would be to checkpoint the processed offsets in the
> streaming job whenever you read from Kafka. Then, before reading the next
> batch, instead of relying on the Spark checkpoint for offsets, read from
> the last processed offset that you saved.
>
> Regards
> Soumitra
>
> On Apr 11, 2016, at 8:31 PM, Siva Gudavalli <gss.su...@gmail.com> wrote:
>
> Okay, that makes sense.
>
> Any recommendations on how to manage changes to my Spark Streaming app
> while achieving fault tolerance at the same time?
>
> On Mon, Apr 11, 2016 at 8:16 PM, Shixiong(Ryan) Zhu
> <shixi...@databricks.com> wrote:
>>
>> You cannot. Streaming doesn't support it, because code changes will
>> break Java serialization.
>>
>> On Mon, Apr 11, 2016 at 4:30 PM, Siva Gudavalli <gss.su...@gmail.com>
>> wrote:
>>>
>>> Hello,
>>>
>>> I am writing a Spark Streaming application to read data from Kafka. I
>>> am using the no-receiver (direct) approach and have enabled
>>> checkpointing to make sure I am not reading messages again in case of
>>> failure (exactly-once semantics).
>>>
>>> I have a quick question: how should checkpointing be configured to
>>> handle code changes in my Spark Streaming app?
>>>
>>> Can you please suggest? I hope the question makes sense.
>>>
>>> Thank you
>>>
>>> Regards
>>> Shiv
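For the store-your-own-offsets approach Soumitra describes, a sketch
against the Spark 1.x direct Kafka API could look like the following.
loadOffsets and saveOffsets are hypothetical helpers backed by your
database or HDFS, and the broker address and batch interval are
placeholders:

    import kafka.common.TopicAndPartition
    import kafka.message.MessageAndMetadata
    import kafka.serializer.StringDecoder

    import org.apache.spark.SparkConf
    import org.apache.spark.streaming.{Seconds, StreamingContext}
    import org.apache.spark.streaming.kafka.{HasOffsetRanges, KafkaUtils, OffsetRange}

    // Hypothetical helpers: persist offsets in your own store (DB/HDFS),
    // outside Spark's checkpoint directory.
    def loadOffsets(): Map[TopicAndPartition, Long] = ???
    def saveOffsets(ranges: Array[OffsetRange]): Unit = ???

    val ssc = new StreamingContext(
      new SparkConf().setAppName("upgradable-streaming-app"), Seconds(10))
    val kafkaParams = Map("metadata.broker.list" -> "broker1:9092")

    // Start from the offsets *you* stored, not from Spark's checkpoint,
    // so an upgraded jar can resume where the old one left off.
    val stream = KafkaUtils.createDirectStream[
        String, String, StringDecoder, StringDecoder, (String, String)](
      ssc, kafkaParams, loadOffsets(),
      (mmd: MessageAndMetadata[String, String]) => (mmd.key, mmd.message))

    stream.foreachRDD { rdd =>
      val ranges = rdd.asInstanceOf[HasOffsetRanges].offsetRanges
      // ... process and write results idempotently ...
      saveOffsets(ranges) // ideally in the same transaction as the results
    }

    ssc.start()
    ssc.awaitTermination()

Because the offsets live outside the checkpoint directory, new code can
rebuild its state and pick up exactly where the old job stopped, without
hitting the Java-serialization problem Shixiong mentions.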