Sean, thanks for your message!

On Mon, Feb 23, 2015 at 6:03 PM, Sean Owen <so...@cloudera.com> wrote:
>
> What I haven't investigated is whether you can enable checkpointing
> for the state in updateStateByKey separately from this mechanism,
> which is exactly your question. What happens if you set a checkpoint
> dir, but do *not* use StreamingContext.getOrCreate, but *do* call
> DStream.checkpoint?
>

I didn't even use StreamingContext.getOrCreate(), just calling
streamingContext.checkpoint(...) blew everything up. Well, "blew up" in the
sense that actor.OneForOneStrategy will print the stack trace of the
java.io.NotSerializableException every couple of seconds and "something" is
not going right with execution (I think).

BUT, indeed, just calling sparkContext.setCheckpointDir seems to be
sufficient for updateStateByKey! Looking at what
streamingContext.checkpoint() does, I don't get why ;-) and I am not sure
that this is a robust solution, but in fact that seems to work!

Thanks a lot,
Tobias