Hi, I am hitting this exception when computing a new RDD from an existing RDD, and also when calling .count on some RDDs. Here is the situation:
val DD1 = D.map(d => { (d._1, D.map(x => math.sqrt(x._2 * d._2)).toArray) })

D is of type RDD[(Int, Double)], and the error message is:

org.apache.spark.SparkException: Job aborted: Task 14.0:8 failed more than 0 times; aborting job
java.lang.NullPointerException
    at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:827)
    at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:825)
    at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:60)
    at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
    at org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:825)
    at org.apache.spark.scheduler.DAGScheduler.processEvent(DAGScheduler.scala:440)
    at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$run(DAGScheduler.scala:502)
    at org.apache.spark.scheduler.DAGScheduler$$anon$1.run(DAGScheduler.scala:157)

I also ran into the same problem when calling .count() on some RDDs. Thanks a lot!
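For what it's worth, the nested reference to D inside the other map's closure looks like a likely trigger, since Spark does not support using one RDD inside a transformation of another. Below is a minimal sketch of the same computation without the nesting; it assumes D is small enough to collect to the driver and that sc is the SparkContext, so treat it as an illustration rather than a drop-in fix:

// Hypothetical rewrite, not the original code: collect the Double values of D
// once on the driver and broadcast them, then build each array per key.
val values = D.map(_._2).collect()            // Array[Double] on the driver
val bcast  = sc.broadcast(values)             // shipped once to each executor
val DD1 = D.map { case (k, v) =>
  (k, bcast.value.map(x => math.sqrt(x * v))) // same sqrt(x._2 * d._2) products
}

If D is too large to collect, something like D.cartesian(D) followed by a groupByKey would be another option, at the cost of a full shuffle.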