What is the checkpoint interval of the DStream returned by updateStateByKey? Did you change it from the default? Also, do you have a simple program and step-by-step instructions I can use to reproduce the issue? If not, can you send me the full DEBUG-level logs of the program from before and after the restart?
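As background for why the checkpoint interval matters here, below is a toy model in plain Python. It is not Spark code, and it is only one possible reading of the trace, not something confirmed in this thread. It assumes that a stateful stream builds the state for batch t from the state for batch t - 1, and that a batch whose state is not already materialized has to be recomputed recursively. The class name `ToyStateStream` and its `generated` cache are hypothetical names made up for this sketch.

```python
class ToyStateStream:
    """Toy stand-in for a stateful stream: state(t) is derived from state(t - 1)."""

    def __init__(self):
        # batch index -> state that is already materialized (hypothetical cache)
        self.generated = {}

    def get_or_compute(self, t):
        """Return (state, frames), where frames is the number of nested calls used."""
        if t in self.generated:
            # State is already available, so the recursion stops here.
            return self.generated[t], 1
        if t == 0:
            # The first batch starts from an empty state.
            state, frames = 0, 1
        else:
            # Recurse into the previous batch, one extra frame per missing batch.
            prev, below = self.get_or_compute(t - 1)
            state, frames = prev + 1, below + 1
        self.generated[t] = state
        return state, frames


# Nothing materialized: the call chain walks back to batch 0.
cold = ToyStateStream()
print(cold.get_or_compute(500))  # (500, 501)

# State for batch 490 is available: only the batches after it are recomputed.
warm = ToyStateStream()
warm.generated[490] = 490
print(warm.get_or_compute(500))  # (500, 11)
```

In this model the call depth grows with the number of batches between the requested batch and the most recent batch whose state is available. A long gap since the last usable state checkpoint would therefore give a very deep chain, which is the kind of pattern the repeating compute/getOrCompute frames in the trace suggest.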
TD

On Mon, Aug 4, 2014 at 8:47 AM, Yana Kadiyska <yana.kadiy...@gmail.com> wrote:
> Hi Spark users,
>
> I'm trying to get a pretty simple streaming program going. My context
> is created via StreamingContext.getOrCreate(checkpointDir,createFn)
>
> Creating a context works fine, but when trying to start from a
> checkpoint I get a stack overflow.
> Any pointers on what could be going wrong? My batch size is 10 seconds,
> the program does a pretty simple "updateStateByKey". It did die
> abnormally. So two questions:
>
> 1. What could be causing a stack so deep?
> 2. What is the way to fix this (deleting everything in the checkpoint
> directory fixed it but is clearly not a good idea)
>
> 14/08/04 15:33:25 INFO FileInputDStream: Set context for org.apache.spark.streaming.dstream.FileInputDStream@ed3ff7a
> 14/08/04 15:33:25 INFO FileInputDStream: Restoring checkpoint data
> 14/08/04 15:33:27 INFO FileInputDStream: Restored checkpoint data
> 14/08/04 15:33:27 INFO FileInputDStream: Restoring checkpoint data
> 14/08/04 15:33:27 INFO FileInputDStream: Restored checkpoint data
> Exception in thread "main" java.lang.StackOverflowError
>     at org.apache.spark.streaming.dstream.MappedDStream.slideDuration(MappedDStream.scala:32)
>     at org.apache.spark.streaming.dstream.FilteredDStream.slideDuration(FilteredDStream.scala:32)
>     at org.apache.spark.streaming.dstream.MappedDStream.slideDuration(MappedDStream.scala:32)
>     at org.apache.spark.streaming.dstream.StateDStream.slideDuration(StateDStream.scala:40)
>     at org.apache.spark.streaming.dstream.DStream.isTimeValid(DStream.scala:265)
>     at org.apache.spark.streaming.dstream.DStream.getOrCompute(DStream.scala:291)
>     at org.apache.spark.streaming.dstream.StateDStream.compute(StateDStream.scala:47)
>     at org.apache.spark.streaming.dstream.DStream.getOrCompute(DStream.scala:292)
>     at org.apache.spark.streaming.dstream.StateDStream.compute(StateDStream.scala:47)
>     at org.apache.spark.streaming.dstream.DStream.getOrCompute(DStream.scala:292)
>     at org.apache.spark.streaming.dstream.StateDStream.compute(StateDStream.scala:47)
>     at org.apache.spark.streaming.dstream.DStream.getOrCompute(DStream.scala:292)
>     ...(the last 2 lines repeat)
>
> Thanks for any insights.
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
> For additional commands, e-mail: user-h...@spark.apache.org