Looking at the source code of DStream.scala:

>   /**
>    * Return a new DStream in which each RDD has a single element generated
>    * by counting each RDD of this DStream.
>    */
>   def count(): DStream[Long] = {
>     this.map(_ => (null, 1L))
>         .transform(_.union(context.sparkContext.makeRDD(Seq((null, 0L)), 1)))
>         .reduceByKey(_ + _)
>         .map(_._2)
>   }

The transform call is the line throwing the NullPointerException. Can anyone
give some hints as to what would cause "_" (the RDD passed into the transform
closure) to be null? It is indeed null, and this only happens when there is no
data to process.

When there's data, no NullPointerException is thrown, and all the
processing/counting proceeds correctly.

I am using a custom InputDStream. Could that be the source of the problem?
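
In case it helps narrow things down: would having my InputDStream's compute()
return an empty RDD on batches with no data, instead of None (or a null RDD),
be the right fix? A minimal sketch of what I have in mind is below. The class
name SafeInputDStream and the fetchBatch function are placeholders for my real
code, not anything from Spark itself.

  import scala.reflect.ClassTag

  import org.apache.spark.rdd.RDD
  import org.apache.spark.streaming.{StreamingContext, Time}
  import org.apache.spark.streaming.dstream.InputDStream

  // Hypothetical receiver-less input stream: fetchBatch stands in for
  // whatever pulls the records for a given batch time in my real code.
  class SafeInputDStream[T: ClassTag](
      _ssc: StreamingContext,
      fetchBatch: Time => Seq[T])
    extends InputDStream[T](_ssc) {

    override def start(): Unit = { }   // nothing to open in this sketch
    override def stop(): Unit = { }

    override def compute(validTime: Time): Option[RDD[T]] = {
      val records = fetchBatch(validTime)
      if (records.isEmpty) {
        // Idle batch: return an empty RDD rather than None or null, so
        // closures like the one in count()'s transform never see a null RDD.
        Some(context.sparkContext.emptyRDD[T])
      } else {
        Some(context.sparkContext.parallelize(records))
      }
    }
  }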


