Hi,
I'm using Spark 1.0.0.

When I run a filter() - map() - coalesce() - saveAsTextFile() sequence, the
following exception is thrown.

Exception in thread "main" java.util.NoSuchElementException: None.get
    at scala.None$.get(Option.scala:313)
    at scala.None$.get(Option.scala:311)
    at org.apache.spark.rdd.PartitionCoalescer.setupGroups(CoalescedRDD.scala:270)
    at org.apache.spark.rdd.PartitionCoalescer.run(CoalescedRDD.scala:337)
    at org.apache.spark.rdd.CoalescedRDD.getPartitions(CoalescedRDD.scala:83)
    at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:204)
    at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:202)
    at scala.Option.getOrElse(Option.scala:120)
    at org.apache.spark.rdd.RDD.partitions(RDD.scala:202)
    at org.apache.spark.rdd.MappedRDD.getPartitions(MappedRDD.scala:28)
    at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:204)
    at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:202)
    at scala.Option.getOrElse(Option.scala:120)
    at org.apache.spark.rdd.RDD.partitions(RDD.scala:202)
    at org.apache.spark.SparkContext.runJob(SparkContext.scala:1086)
    at org.apache.spark.rdd.PairRDDFunctions.saveAsHadoopDataset(PairRDDFunctions.scala:788)
    at org.apache.spark.rdd.PairRDDFunctions.saveAsHadoopFile(PairRDDFunctions.scala:674)
    at org.apache.spark.rdd.PairRDDFunctions.saveAsHadoopFile(PairRDDFunctions.scala:593)
    at org.apache.spark.rdd.RDD.saveAsTextFile(RDD.scala:1068)
    at org.apache.spark.api.java.JavaRDDLike$class.saveAsTextFile(JavaRDDLike.scala:436)
    at org.apache.spark.api.java.JavaRDD.saveAsTextFile(JavaRDD.scala:29)
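
For reference, the job looks roughly like this (a minimal sketch; the class
name, the input/output paths, and the filter/map bodies are placeholders,
not the real ones):

    import org.apache.spark.SparkConf;
    import org.apache.spark.api.java.JavaRDD;
    import org.apache.spark.api.java.JavaSparkContext;
    import org.apache.spark.api.java.function.Function;

    public class CoalesceRepro {
        public static void main(String[] args) {
            SparkConf conf = new SparkConf().setAppName("coalesce-repro");
            JavaSparkContext sc = new JavaSparkContext(conf);

            // Placeholder input; the real input yields an RDD with 306 partitions.
            JavaRDD<String> lines = sc.textFile("hdfs:///input");

            JavaRDD<String> out = lines
                .filter(new Function<String, Boolean>() {
                    public Boolean call(String s) { return !s.isEmpty(); }  // placeholder predicate
                })
                .map(new Function<String, String>() {
                    public String call(String s) { return s.trim(); }  // placeholder transform
                })
                .coalesce(60);  // any of 59..63 triggers the None.get above

            out.saveAsTextFile("hdfs:///output");
            sc.stop();
        }
    }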

The partition count of the original RDD is 306.

When the argument of coalesce() is one of 59, 60, 61, 62, 63, the exception
above is thrown.

But when the argument is one of 50, 55, 58, 64, 65, 80, or 100, the exception
is not thrown. (I haven't tried other values; I expect they would be fine.)
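
This is roughly how I probed the values (a sketch; probe() is a hypothetical
helper, and rdd stands for the 306-partition result of the filter()/map()
steps):

    import org.apache.spark.api.java.JavaRDD;

    public class CoalesceProbe {
        // Computing the coalesced RDD's partition list is enough to reach
        // CoalescedRDD.getPartitions, where the None.get is thrown; no job
        // actually has to run.
        static void probe(JavaRDD<String> rdd) {
            for (int n : new int[] {50, 55, 58, 59, 60, 61, 62, 63, 64, 65, 80, 100}) {
                try {
                    int parts = rdd.coalesce(n).partitions().size();
                    System.out.println("coalesce(" + n + ") -> " + parts + " partitions");
                } catch (java.util.NoSuchElementException e) {
                    System.out.println("coalesce(" + n + ") -> " + e);
                }
            }
        }
    }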

Is there any magic number for the argument of coalesce()?

Thanks.

