I'm trying to work around the StackOverflowError that occurs when an object has a long dependency chain. Someone suggested that I use checkpointing to cut off the dependencies. I wrote some sample code to test it, but I can only checkpoint the edges, not the vertices. I believe I materialize both the vertices and the edges after calling checkpoint, so why are only the edges checkpointed?
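For reference, this is how I understand checkpointing is supposed to cut the lineage on a plain RDD. It is only a minimal sketch under my own assumptions: the object name `RddCheckpointSketch`, the `run` helper, the iteration counts, and the checkpoint directory are made up for illustration, and it does not involve GraphX at all. The one rule it relies on is that `checkpoint()` has to be called before the first action runs on that RDD.

```scala
import org.apache.spark.{SparkConf, SparkContext}

// Hypothetical sketch: periodically checkpoint an RDD whose lineage keeps growing.
object RddCheckpointSketch {

  // Returns whether the final RDD reports itself as checkpointed.
  def run(): Boolean = {
    val conf = new SparkConf().setAppName("RddCheckpointSketch").setMaster("local[2]")
    val sc = new SparkContext(conf)
    try {
      sc.setCheckpointDir("./checkpoint") // assumed local directory

      var rdd = sc.parallelize(1 to 10)
      for (i <- 1 to 200) {
        rdd = rdd.map(_ + 1)  // each iteration adds one step to the lineage
        if (i % 50 == 0) {
          rdd.cache()         // avoid recomputing when the checkpoint is written
          rdd.checkpoint()    // mark for checkpointing BEFORE any action on this RDD
          rdd.count()         // the action materializes the RDD and triggers the write
        }
      }
      rdd.isCheckpointed
    } finally {
      sc.stop()
    }
  }

  def main(args: Array[String]): Unit = {
    println("checkpointed: " + run())
  }
}
```

With plain RDDs this pattern behaves as I expect, which is why the vertices result in my GraphX test confuses me.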
Here is my GraphX test code. I'd really appreciate it if you could point out what I did wrong.

import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.graphx.{Edge, Graph, VertexId}

def main(args: Array[String]) {
  val conf = new SparkConf().setAppName("Test")
    .setMaster("local[4]")
  val sc = new SparkContext(conf)
  sc.setCheckpointDir("./checkpoint")

  // A small three-vertex cycle: 0 -> 1 -> 2 -> 0
  val v = sc.parallelize(Seq[(VertexId, Long)]((0L, 0L), (1L, 1L), (2L, 2L)))
  val e = sc.parallelize(Seq[Edge[Long]](Edge(0L, 1L, 0L), Edge(1L, 2L, 1L), Edge(2L, 0L, 2L)))
  var g = Graph(v, e)
  val vertexIds = Seq(0L, 1L, 2L)
  var prevG: Graph[VertexId, Long] = null

  for (i <- 1 to 100000) {
    vertexIds.toStream.foreach(id => {
      println("generate new graph")
      prevG = g
      // Rebuild the graph from the previous one, which lengthens the lineage
      g = Graph(g.vertices, g.edges)

      println("uncache vertices")
      prevG.unpersistVertices(blocking = false)
      println("uncache edges")
      prevG.edges.unpersist(blocking = false)

      // Third approach, do checkpoint
      // Vertices can not be checkpointed, still have StackOverflowError
      g.vertices.checkpoint()
      g.edges.checkpoint()
      // Materialize both RDDs after requesting the checkpoint
      println(g.vertices.count() + g.edges.count())
      println(g.vertices.isCheckpointed + " " + g.edges.isCheckpointed)
    })
    println(" iter " + i + " finished")
  }
}

--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Can-not-checkpoint-Graph-object-s-vertices-but-could-checkpoint-edges-tp8019.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.