Hi Spark users,

I'm trying to get a pretty simple streaming program going. My context
is created via StreamingContext.getOrCreate(checkpointDir, createFn).

Creating a fresh context works fine, but when the program tries to start
from an existing checkpoint I get a stack overflow.
Any pointers on what could be going wrong? My batch interval is 10 seconds,
and the program does a pretty simple "updateStateByKey". The previous run
did die abnormally, which is why it is now recovering from the checkpoint.
So, two questions:

1. What could be causing a stack this deep?
2. What is the right way to fix this? (Deleting everything in the checkpoint
directory made the error go away, but throwing away the state is clearly not
a good solution.)
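
For reference, this is roughly what the program looks like (stripped down;
the paths, the parsing, and the state-update function below are stand-ins
for my real code, not the actual job):

import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.streaming.StreamingContext._  // pair-DStream functions

object CheckpointedApp {
  // Stand-in path; the real checkpoint directory is on HDFS as well.
  val checkpointDir = "hdfs:///tmp/streaming-checkpoint"

  def createContext(): StreamingContext = {
    val conf = new SparkConf().setAppName("CheckpointedApp")
    val ssc = new StreamingContext(conf, Seconds(10))  // 10-second batches
    ssc.checkpoint(checkpointDir)

    // Watch a directory, then a simple map/filter/map/updateStateByKey chain
    // that keeps a running count per key.
    val lines = ssc.textFileStream("hdfs:///tmp/streaming-input")
    val counts = lines
      .map(_.split(","))
      .filter(_.length == 2)
      .map(fields => (fields(0), 1))
      .updateStateByKey[Int]((values: Seq[Int], state: Option[Int]) =>
        Some(values.sum + state.getOrElse(0)))
    counts.print()
    ssc
  }

  def main(args: Array[String]): Unit = {
    // Recover from the checkpoint if one exists, otherwise build a new context.
    val ssc = StreamingContext.getOrCreate(checkpointDir, createContext _)
    ssc.start()
    ssc.awaitTermination()
  }
}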



14/08/04 15:33:25 INFO FileInputDStream: Set context for
org.apache.spark.streaming.dstream.FileInputDStream@ed3ff7a
14/08/04 15:33:25 INFO FileInputDStream: Restoring checkpoint data
14/08/04 15:33:27 INFO FileInputDStream: Restored checkpoint data
14/08/04 15:33:27 INFO FileInputDStream: Restoring checkpoint data
14/08/04 15:33:27 INFO FileInputDStream: Restored checkpoint data
Exception in thread "main" java.lang.StackOverflowError
        at org.apache.spark.streaming.dstream.MappedDStream.slideDuration(MappedDStream.scala:32)
        at org.apache.spark.streaming.dstream.FilteredDStream.slideDuration(FilteredDStream.scala:32)
        at org.apache.spark.streaming.dstream.MappedDStream.slideDuration(MappedDStream.scala:32)
        at org.apache.spark.streaming.dstream.StateDStream.slideDuration(StateDStream.scala:40)
        at org.apache.spark.streaming.dstream.DStream.isTimeValid(DStream.scala:265)
        at org.apache.spark.streaming.dstream.DStream.getOrCompute(DStream.scala:291)
        at org.apache.spark.streaming.dstream.StateDStream.compute(StateDStream.scala:47)
        at org.apache.spark.streaming.dstream.DStream.getOrCompute(DStream.scala:292)
        at org.apache.spark.streaming.dstream.StateDStream.compute(StateDStream.scala:47)
        at org.apache.spark.streaming.dstream.DStream.getOrCompute(DStream.scala:292)
        at org.apache.spark.streaming.dstream.StateDStream.compute(StateDStream.scala:47)
        at org.apache.spark.streaming.dstream.DStream.getOrCompute(DStream.scala:292)
        ... (the last 2 lines repeat)

Thanks for any insights.
