---------------------------- Original Message ----------------------------
Subject: matchError:null in ALS.train
From: "Honey Joshi" <honeyjo...@ideata-analytics.com>
Date: Thu, July 3, 2014 8:12 am
To: user@spark.apache.org
--------------------------------------------------------------------------
Hi All,

We are using ALS.train to generate a model for predictions. We collect the predicted output in a DStream and then try to write it to a text file using two approaches: dstream.saveAsTextFiles() and dstream.foreachRDD(rdd => rdd.saveAsTextFile). Both approaches give us the following error:

Exception in thread "main" org.apache.spark.SparkException: Job aborted due to stage failure: Task 1.0:0 failed 1 times, most recent failure: Exception failure in TID 0 on host localhost: scala.MatchError: null
        org.apache.spark.rdd.PairRDDFunctions.lookup(PairRDDFunctions.scala:571)
        org.apache.spark.mllib.recommendation.MatrixFactorizationModel.predict(MatrixFactorizationModel.scala:43)
        MyOperator$$anonfun$7.apply(MyOperator.scala:213)
        MyOperator$$anonfun$7.apply(MyOperator.scala:180)
        scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
        scala.collection.Iterator$class.foreach(Iterator.scala:727)
        scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
        scala.collection.generic.Growable$class.$plus$plus$eq(Growable.scala:48)
        scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:103)
        org.apache.spark.CacheManager.getOrCompute(CacheManager.scala:107)
        org.apache.spark.rdd.RDD.iterator(RDD.scala:227)
        org.apache.spark.rdd.MappedRDD.compute(MappedRDD.scala:31)
        org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:262)
        org.apache.spark.rdd.RDD.iterator(RDD.scala:229)
        org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:111)
        org.apache.spark.scheduler.Task.run(Task.scala:51)
        org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:187)
        java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        java.lang.Thread.run(Thread.java:744)

We tried this on both Spark 0.9.1 and 1.0.0, with Scala 2.10.3. Can anybody help me with this issue?

Thank you.

Regards,
Honey Joshi
Ideata-Analytics