Thank you. It works. (I've applied the patched source code to my local 1.0.0 build.)
-----Original Message-----
From: Sean Owen [mailto:so...@cloudera.com]
Sent: Friday, July 25, 2014 11:47 PM
To: user@spark.apache.org
Subject: Re: Strange exception on coalesce()

I'm pretty sure this was already fixed last week in SPARK-2414:
https://github.com/apache/spark/commit/7c23c0dc3ed721c95690fc49f435d9de6952523c

On Fri, Jul 25, 2014 at 1:34 PM, innowireless TaeYun Kim
<taeyun....@innowireless.co.kr> wrote:
> Hi,
> I'm using Spark 1.0.0.
>
> On a filter() - map() - coalesce() - saveAsTextFile() sequence, the
> following exception is thrown.
>
> Exception in thread "main" java.util.NoSuchElementException: None.get
>     at scala.None$.get(Option.scala:313)
>     at scala.None$.get(Option.scala:311)
>     at org.apache.spark.rdd.PartitionCoalescer.setupGroups(CoalescedRDD.scala:270)
>     at org.apache.spark.rdd.PartitionCoalescer.run(CoalescedRDD.scala:337)
>     at org.apache.spark.rdd.CoalescedRDD.getPartitions(CoalescedRDD.scala:83)
>     at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:204)
>     at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:202)
>     at scala.Option.getOrElse(Option.scala:120)
>     at org.apache.spark.rdd.RDD.partitions(RDD.scala:202)
>     at org.apache.spark.rdd.MappedRDD.getPartitions(MappedRDD.scala:28)
>     at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:204)
>     at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:202)
>     at scala.Option.getOrElse(Option.scala:120)
>     at org.apache.spark.rdd.RDD.partitions(RDD.scala:202)
>     at org.apache.spark.SparkContext.runJob(SparkContext.scala:1086)
>     at org.apache.spark.rdd.PairRDDFunctions.saveAsHadoopDataset(PairRDDFunctions.scala:788)
>     at org.apache.spark.rdd.PairRDDFunctions.saveAsHadoopFile(PairRDDFunctions.scala:674)
>     at org.apache.spark.rdd.PairRDDFunctions.saveAsHadoopFile(PairRDDFunctions.scala:593)
>     at org.apache.spark.rdd.RDD.saveAsTextFile(RDD.scala:1068)
>     at org.apache.spark.api.java.JavaRDDLike$class.saveAsTextFile(JavaRDDLike.scala:436)
>     at org.apache.spark.api.java.JavaRDD.saveAsTextFile(JavaRDD.scala:29)
>
> The partition count of the original RDD is 306.
>
> When the argument of coalesce() is one of 59, 60, 61, 62, or 63, the
> exception above is thrown.
>
> But when the argument is one of 50, 55, 58, 64, 65, 80, or 100, the
> exception is not thrown. (I haven't tried other values; I think they
> will be OK.)
>
> Is there a magic number for the argument of coalesce()?
>
> Thanks.
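For reference, a minimal sketch of the kind of pipeline being discussed (this is not the original poster's code; the input/output paths, the predicate, and the map function are all hypothetical placeholders, and the bug itself only manifests on Spark 1.0.0 before the SPARK-2414 fix):

```scala
// Hypothetical reproduction sketch of the filter -> map -> coalesce ->
// saveAsTextFile sequence from the thread. Assumes a local SparkContext;
// paths and lambdas are placeholders, not the original code.
import org.apache.spark.{SparkConf, SparkContext}

object CoalesceRepro {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("coalesce-repro").setMaster("local[*]")
    val sc = new SparkContext(conf)

    sc.textFile("hdfs:///path/to/input")   // placeholder path; ~306 partitions in the report
      .filter(_.nonEmpty)                  // placeholder predicate
      .map(_.toUpperCase)                  // placeholder transformation
      .coalesce(60)                        // values 59-63 triggered the exception on 1.0.0
      .saveAsTextFile("hdfs:///path/to/output")  // placeholder path

    sc.stop()
  }
}
```

There is no magic number per se: `coalesce(n)` with `shuffle = false` (the default) builds a `CoalescedRDD` whose `PartitionCoalescer` groups parent partitions by preferred location, and for certain ratios of parent to target partition counts the pre-SPARK-2414 grouping code could call `.get` on a `None`, producing the `NoSuchElementException` seen in the trace.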