Thank you. It works. (I've applied the patched source code to my local 1.0.0 build.)
-----Original Message-----
From: Sean Owen [mailto:so...@cloudera.com]
Sent: Friday, July 25, 2014 11:47 PM
To: user@spark.apache.org
Subject: Re: Strange exception on coalesce()

I'm pretty sure this was already fixed last week in SPARK-2414:
https://github.com/apache/spark/commit/7c23c0dc3ed721c95690fc49f435d9de6952523c

On Fri, Jul 25, 2014 at 1:34 PM, innowireless TaeYun Kim
<taeyun....@innowireless.co.kr> wrote:
> Hi,
> I'm using Spark 1.0.0.
>
> On a filter() - map() - coalesce() - saveAsTextFile() sequence, the
> following exception is thrown.
>
> Exception in thread "main" java.util.NoSuchElementException: None.get
>     at scala.None$.get(Option.scala:313)
>     at scala.None$.get(Option.scala:311)
>     at org.apache.spark.rdd.PartitionCoalescer.setupGroups(CoalescedRDD.scala:270)
>     at org.apache.spark.rdd.PartitionCoalescer.run(CoalescedRDD.scala:337)
>     at org.apache.spark.rdd.CoalescedRDD.getPartitions(CoalescedRDD.scala:83)
>     at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:204)
>     at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:202)
>     at scala.Option.getOrElse(Option.scala:120)
>     at org.apache.spark.rdd.RDD.partitions(RDD.scala:202)
>     at org.apache.spark.rdd.MappedRDD.getPartitions(MappedRDD.scala:28)
>     at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:204)
>     at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:202)
>     at scala.Option.getOrElse(Option.scala:120)
>     at org.apache.spark.rdd.RDD.partitions(RDD.scala:202)
>     at org.apache.spark.SparkContext.runJob(SparkContext.scala:1086)
>     at org.apache.spark.rdd.PairRDDFunctions.saveAsHadoopDataset(PairRDDFunctions.scala:788)
>     at org.apache.spark.rdd.PairRDDFunctions.saveAsHadoopFile(PairRDDFunctions.scala:674)
>     at org.apache.spark.rdd.PairRDDFunctions.saveAsHadoopFile(PairRDDFunctions.scala:593)
>     at org.apache.spark.rdd.RDD.saveAsTextFile(RDD.scala:1068)
>     at org.apache.spark.api.java.JavaRDDLike$class.saveAsTextFile(JavaRDDLike.scala:436)
>     at org.apache.spark.api.java.JavaRDD.saveAsTextFile(JavaRDD.scala:29)
>
> The partition count of the original RDD is 306.
>
> When the argument of coalesce() is one of 59, 60, 61, 62, or 63, the
> exception above is thrown.
>
> But when the argument is one of 50, 55, 58, 64, 65, 80, or 100, the
> exception is not thrown. (I haven't tried other values; I think they
> will be OK.)
>
> Is there a magic number for the argument of coalesce()?
>
> Thanks.
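For reference, a minimal sketch of the kind of pipeline being discussed (this is not the original poster's code; the input/output paths, the predicate, and the map function are all hypothetical placeholders, and the bug itself only manifests on Spark 1.0.0 before the SPARK-2414 fix):

```scala
// Hypothetical reproduction sketch of the filter -> map -> coalesce ->
// saveAsTextFile sequence from the thread. Assumes a local SparkContext;
// paths and lambdas are placeholders, not the original code.
import org.apache.spark.{SparkConf, SparkContext}

object CoalesceRepro {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("coalesce-repro").setMaster("local[*]")
    val sc = new SparkContext(conf)

    sc.textFile("hdfs:///path/to/input")   // placeholder path; ~306 partitions in the report
      .filter(_.nonEmpty)                  // placeholder predicate
      .map(_.toUpperCase)                  // placeholder transformation
      .coalesce(60)                        // values 59-63 triggered the exception on 1.0.0
      .saveAsTextFile("hdfs:///path/to/output")  // placeholder path

    sc.stop()
  }
}
```

There is no magic number per se: `coalesce(n)` with `shuffle = false` (the default) builds a `CoalescedRDD` whose `PartitionCoalescer` groups parent partitions by preferred location, and for certain ratios of parent to target partition counts the pre-SPARK-2414 grouping code could call `.get` on a `None`, producing the `NoSuchElementException` seen in the trace.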