What is the checkpoint interval of the updateStateByKey DStream? Did
you modify it? Also, do you have a simple program and a step-by-step
process by which I can reproduce the issue? If not, can you give me
the full DEBUG-level logs of the program from before and after the
restart?
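
For reference, the interval can be set explicitly on the state stream
itself. A minimal sketch, where pairs, updateFunc, and the 50-second
value are placeholders rather than anything from your program:

import org.apache.spark.streaming.Seconds

// The DStream returned by updateStateByKey checkpoints its state RDDs
// at its own interval, which can be overridden explicitly.
// pairs (a DStream of key/value pairs) and updateFunc are placeholders.
val stateStream = pairs.updateStateByKey(updateFunc)
stateStream.checkpoint(Seconds(50))  // e.g. several times the batch interval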

TD

On Mon, Aug 4, 2014 at 8:47 AM, Yana Kadiyska <yana.kadiy...@gmail.com> wrote:
> Hi Spark users,
>
> I'm trying to get a pretty simple streaming program going. My context
> is created via StreamingContext.getOrCreate(checkpointDir, createFn).
>
> Creating a context works fine, but when trying to start from a
> checkpoint I get a stack overflow.
> Any pointers on what could be going wrong? My batch size is 10 seconds,
> and the program does a pretty simple "updateStateByKey". The previous
> run did die abnormally. So, two questions:
>
> 1. What could be causing such a deep stack?
> 2. What is the right way to fix this? (Deleting everything in the
> checkpoint directory fixed it, but that is clearly not a good idea.)
>
>
>
> 14/08/04 15:33:25 INFO FileInputDStream: Set context for
> org.apache.spark.streaming.dstream.FileInputDStream@ed3ff7a
> 14/08/04 15:33:25 INFO FileInputDStream: Restoring checkpoint data
> 14/08/04 15:33:27 INFO FileInputDStream: Restored checkpoint data
> 14/08/04 15:33:27 INFO FileInputDStream: Restoring checkpoint data
> 14/08/04 15:33:27 INFO FileInputDStream: Restored checkpoint data
> Exception in thread "main" java.lang.StackOverflowError
>         at org.apache.spark.streaming.dstream.MappedDStream.slideDuration(MappedDStream.scala:32)
>         at org.apache.spark.streaming.dstream.FilteredDStream.slideDuration(FilteredDStream.scala:32)
>         at org.apache.spark.streaming.dstream.MappedDStream.slideDuration(MappedDStream.scala:32)
>         at org.apache.spark.streaming.dstream.StateDStream.slideDuration(StateDStream.scala:40)
>         at org.apache.spark.streaming.dstream.DStream.isTimeValid(DStream.scala:265)
>         at org.apache.spark.streaming.dstream.DStream.getOrCompute(DStream.scala:291)
>         at org.apache.spark.streaming.dstream.StateDStream.compute(StateDStream.scala:47)
>         at org.apache.spark.streaming.dstream.DStream.getOrCompute(DStream.scala:292)
>         at org.apache.spark.streaming.dstream.StateDStream.compute(StateDStream.scala:47)
>         at org.apache.spark.streaming.dstream.DStream.getOrCompute(DStream.scala:292)
>         at org.apache.spark.streaming.dstream.StateDStream.compute(StateDStream.scala:47)
>         at org.apache.spark.streaming.dstream.DStream.getOrCompute(DStream.scala:292)
>         ... (the last two lines repeat)
>
> thanks for any insights.
>
>
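
For concreteness, a self-contained skeleton along these lines is the
kind of minimal repro that helps. Every name below (the checkpoint
directory, input directory, and update function) is made up here, not
taken from the original program:

import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.streaming.StreamingContext._

object CheckpointRepro {
  // Placeholder paths; substitute whatever the real program uses.
  val checkpointDir = "/tmp/streaming-checkpoint"
  val inputDir = "/tmp/streaming-input"

  // Factory passed to getOrCreate; only invoked when no checkpoint exists.
  def createFn(): StreamingContext = {
    val conf = new SparkConf().setAppName("CheckpointRepro")
    val ssc = new StreamingContext(conf, Seconds(10))  // 10-second batches
    ssc.checkpoint(checkpointDir)

    // File source, matching the FileInputDStream seen in the logs above.
    val lines = ssc.textFileStream(inputDir)
    val counts = lines
      .map(word => (word, 1))
      .updateStateByKey[Int] { (values: Seq[Int], state: Option[Int]) =>
        Some(state.getOrElse(0) + values.sum)  // running count per key
      }
    counts.print()
    ssc
  }

  def main(args: Array[String]): Unit = {
    // Recovers from the checkpoint if one exists, otherwise builds a new context.
    val ssc = StreamingContext.getOrCreate(checkpointDir, createFn _)
    ssc.start()
    ssc.awaitTermination()
  }
}

Run it once, kill it after a few batches have completed, and start the
same jar again with the same checkpoint directory to exercise the
recovery path.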
