Sean,

thanks for your message!

On Mon, Feb 23, 2015 at 6:03 PM, Sean Owen <so...@cloudera.com> wrote:
>
> What I haven't investigated is whether you can enable checkpointing
> for the state in updateStateByKey separately from this mechanism,
> which is exactly your question. What happens if you set a checkpoint
> dir, but do *not* use StreamingContext.getOrCreate, but *do* call
> DStream.checkpoint?
>

I didn't even use StreamingContext.getOrCreate(), just calling
streamingContext.checkpoint(...) blew everything up. Well, "blew up" in the
sense that actor.OneForOneStrategy will print the stack trace of
the java.io.NotSerializableException every couple of seconds and
"something" is not going right with execution (I think).

BUT, indeed, just calling sparkContext.setCheckpointDir seems to be
sufficient for updateStateByKey! Looking at what
streamingContext.checkpoint() does, I don't get why ;-) and I am not sure
that this is a robust solution, but in fact that seems to work!

Thanks a lot,
Tobias

Reply via email to