Hi all!
Here's a part of a Scala recursion that produces a stack overflow after many
recursions. I've tried many things but I've not managed to solve it.

val eRDD: RDD[(Int, Int)] = ...
val oldRDD: RDD[(Int, Int)] = ...
val result = algorithm(eRDD, oldRDD)

def algorithm(eRDD: RDD[(Int, Int)], oldRDD: RDD[(Int, Int)]): RDD[(Int, Int)] = {
  val newRDD = transformation(eRDD, oldRDD) // only transformations
  if (compare(oldRDD, newRDD)) // compare uses the "take" action!
    algorithm(eRDD, newRDD)
  else
    newRDD
}

The above code is recursive and performs many iterations (until compare
returns false).
After some iterations I get a stack overflow error, probably because the
lineage chain has become too long. Is there any way to solve this problem
(persist/unpersist, checkpoint, saveAsObjectFile)?
Note 1: Only the compare function performs actions on RDDs.
Note 2: I tried some combinations of persist/unpersist, but none of them
worked!
I tried checkpointing from spark.streaming. I put a checkpoint at every
recursion but still received an overflow error.
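
For comparison, here is a minimal sketch of the core RDD checkpoint API (sc.setCheckpointDir plus RDD.checkpoint) rather than the streaming one, with the recursion rewritten as a loop. It is only an illustration under assumptions: the object and method names, the step and changed parameters (stand-ins for the transformation and compare functions above), the checkpoint directory and the checkpointEvery interval are all hypothetical, not part of the original code.

```scala
import org.apache.spark.SparkContext
import org.apache.spark.rdd.RDD

object LineageTruncation {
  type Pairs = RDD[(Int, Int)]

  // Hypothetical driver loop: `step` stands in for the transformation and
  // `changed` for the compare function; neither is the real implementation.
  def iterate(
      sc: SparkContext,
      eRDD: Pairs,
      initial: Pairs,
      step: (Pairs, Pairs) => Pairs,
      changed: (Pairs, Pairs) => Boolean,
      checkpointDir: String,       // assumed path, e.g. a directory on HDFS
      checkpointEvery: Int = 10    // assumed interval, to be tuned
  ): Pairs = {
    sc.setCheckpointDir(checkpointDir)
    var current = initial
    var iteration = 0
    var keepGoing = true
    while (keepGoing) {
      iteration += 1
      // Persist so the checkpoint job does not recompute the whole chain.
      val next = step(eRDD, current).persist()
      if (iteration % checkpointEvery == 0) {
        next.checkpoint() // must be called before any job has run on `next`
        next.count()      // action: materializes `next` and writes the checkpoint
      }
      keepGoing = changed(current, next)
      current.unpersist() // drop the cached copy of the previous iteration
      current = next
    }
    current
  }
}
```

Once the checkpoint has been written, the lineage of that RDD no longer reaches back to earlier iterations, so the chain should stay at most checkpointEvery steps long. The loop also avoids growing the JVM call stack with one frame per iteration.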
I also tried saving the RDD with saveAsObjectFile in each iteration and then
reading it back from the file (sc.objectFile) during the next iteration.
Unfortunately, I noticed that the folders created per iteration keep
increasing in size, while I was expecting them to have equal size per
iteration.
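
The save-and-reload step described above can be sketched roughly as follows. The object name, the reload helper, the baseDir argument and the iter-N folder layout are assumptions made for illustration only.

```scala
import org.apache.spark.SparkContext
import org.apache.spark.rdd.RDD

object ObjectFileRoundTrip {
  // Writes `rdd` under a per-iteration folder (hypothetical layout) and
  // reads it back, so the returned RDD's lineage starts at the saved files.
  def reload(
      sc: SparkContext,
      rdd: RDD[(Int, Int)],
      baseDir: String,
      iteration: Int
  ): RDD[(Int, Int)] = {
    val path = s"$baseDir/iter-$iteration"
    rdd.saveAsObjectFile(path)      // action: runs the job and writes the data
    sc.objectFile[(Int, Int)](path) // fresh RDD backed only by the files
  }
}
```

If the folders still grow, comparing rdd.count() across iterations might show whether the number of records itself is increasing, which would point at the transformation rather than at the save step.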
Please help!
--
View this message in context:
http://apache-spark-user-list.1001560.n3.nabble.com/Stack-overflow-error-caused-by-long-lineage-RDD-created-after-many-recursions-tp25240.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.