Re: kafka + Spark Streaming with checkPointing fails to restart

2015-05-14 Thread Ankur Chauhan
-BEGIN PGP SIGNED MESSAGE- Hash: SHA1 Thanks everyone, that was the problem. the "create new streaming context" function was supposed to setup the stream processing as well as the checkpoint directory. I had missed the whole process of checkpoint setup. With that done, everything works as

Re: kafka + Spark Streaming with checkPointing fails to restart

2015-05-13 Thread NB
The data pipeline (DAG) should not be added to the StreamingContext in the case of a recovery scenario. The pipeline metadata is recovered from the checkpoint folder. That is one thing you will need to fix in your code. Also, I don't think the ssc.checkpoint(folder) call should be made in case of t

Re: kafka + Spark Streaming with checkPointing fails to restart

2015-05-13 Thread Cody Koeninger
http://spark.apache.org/docs/latest/streaming-programming-guide.html#checkpointing shows setting up your stream and calling .checkpoint(checkpointDir) inside the functionToCreateContext. It looks to me like you're setting up your stream and calling checkpoint outside, after getOrCreate. I'm not