Thanks everyone, that was the problem. The "create new streaming
context" function was supposed to set up the stream processing as well
as the checkpoint directory. I had missed the whole process of
checkpoint setup. With that done, everything works as
The data pipeline (DAG) should not be added to the StreamingContext in the
case of a recovery scenario. The pipeline metadata is recovered from the
checkpoint folder. That is one thing you will need to fix in your code.
Also, I don't think the ssc.checkpoint(folder) call should be made in case
of t
http://spark.apache.org/docs/latest/streaming-programming-guide.html#checkpointing
shows setting up your stream and calling .checkpoint(checkpointDir) inside
the functionToCreateContext. It looks to me like you're setting up your
stream and calling checkpoint outside, after getOrCreate.
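
For reference, here is a minimal sketch of the pattern the guide describes, written against PySpark's DStream API. The checkpoint path, host, port, app name and the word-count pipeline are all hypothetical placeholders and not taken from the original poster's code. The point is only that both the stream setup and the checkpoint() call live inside the factory function, so neither runs when the context is recovered from the checkpoint folder.

```python
# Hypothetical checkpoint location; substitute your own (e.g. an HDFS path).
CHECKPOINT_DIR = "/tmp/streaming-checkpoint"


def create_context():
    """Build a brand-new StreamingContext.

    getOrCreate() calls this only when no usable checkpoint exists, so the
    pipeline definition and the checkpoint() call both belong in here.
    """
    # Imported here so that merely loading this file does not require Spark.
    from pyspark import SparkContext
    from pyspark.streaming import StreamingContext

    sc = SparkContext(appName="CheckpointedWordCount")  # hypothetical app name
    ssc = StreamingContext(sc, 10)  # 10-second batches

    # Define the pipeline (the DAG). Host and port are placeholders.
    lines = ssc.socketTextStream("localhost", 9999)
    counts = (
        lines.flatMap(lambda line: line.split(" "))
        .map(lambda word: (word, 1))
        .reduceByKey(lambda a, b: a + b)
    )
    counts.pprint()

    # Enable checkpointing as part of creating the fresh context.
    ssc.checkpoint(CHECKPOINT_DIR)
    return ssc


def main():
    from pyspark.streaming import StreamingContext

    # Recover from CHECKPOINT_DIR if possible, otherwise call create_context().
    # No streams are added and checkpoint() is not called after this line.
    ssc = StreamingContext.getOrCreate(CHECKPOINT_DIR, create_context)
    ssc.start()
    ssc.awaitTermination()
```

To try it, call main() at the bottom of the script and launch it with spark-submit. On the first run the factory is invoked; after a restart the context should be rebuilt from the checkpoint folder without create_context() running again.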
I'm not