On 01/29/2015 08:31 PM, Ankur Dave wrote:
Thanks for the reminder. I just created a PR:
https://github.com/apache/spark/pull/4273
Ankur
Hello,
Thanks for the patch. I applied it on Pregel.scala (in Spark 1.2.0 sources) and
rebuilt
Spark. During execution, at the 25th iteration of Pregel, checkpointing is done
and then
it throws the following exception :
Exception in thread "main" org.apache.spark.SparkException: Checkpoint RDD CheckpointRDD[521] at reduce at VertexRDDImpl.scala:80(0) has
different number of partitions than original RDD VertexRDD ZippedPartitionsRDD2[518] at zipPartitions at VertexRDDImpl.scala:170(2)
at
org.apache.spark.rdd.RDDCheckpointData.doCheckpoint(RDDCheckpointData.scala:98)
at org.apache.spark.rdd.RDD.doCheckpoint(RDD.scala:1279)
at
org.apache.spark.rdd.RDD$$anonfun$doCheckpoint$1.apply(RDD.scala:1281)
at
org.apache.spark.rdd.RDD$$anonfun$doCheckpoint$1.apply(RDD.scala:1281)
at scala.collection.immutable.List.foreach(List.scala:318)
at org.apache.spark.rdd.RDD.doCheckpoint(RDD.scala:1281)
at org.apache.spark.SparkContext.runJob(SparkContext.scala:1285)
at org.apache.spark.SparkContext.runJob(SparkContext.scala:1351)
at org.apache.spark.rdd.RDD.reduce(RDD.scala:867)
at
org.apache.spark.graphx.impl.VertexRDDImpl.count(VertexRDDImpl.scala:80)
at org.apache.spark.graphx.Pregel$.apply(Pregel.scala:155)
at
org.apache.spark.graphx.lib.ShortestPaths$.run(ShortestPaths.scala:69)
....
Pregel.scala:155 is the following line in the pregel loop:
activeMessages = messages.count()
---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
For additional commands, e-mail: user-h...@spark.apache.org