That doesn't seem to be the problem, though. It processes for a while but then stops. Presumably there are many executors.

On Nov 22, 2014 9:40 AM, "Akhil Das" <ak...@sigmoidanalytics.com> wrote:
> For Spark Streaming you must always set *--executor-cores* to a value
> >= 2; otherwise the job will not do any processing.
>
> Thanks
> Best Regards
>
> On Sat, Nov 22, 2014 at 8:39 AM, pankaj channe <pankajc...@gmail.com>
> wrote:
>
>> I have seen similar posts on this issue but could not find a solution.
>> Apologies if this has been discussed here before.
>>
>> I am running a Spark Streaming job with YARN on a 5-node cluster. I am
>> using the following command to submit my streaming job:
>>
>> spark-submit --class class_name --master yarn-cluster --num-executors 1 \
>>   --driver-memory 1g --executor-memory 1g --executor-cores 1 my_app.jar
>>
>> After running for some time, the job stops. The application log shows
>> the following two errors:
>>
>> 14/11/21 22:05:04 WARN yarn.ApplicationMaster: Unable to retrieve
>> SparkContext in spite of waiting for 100000, maxNumTries = 10
>> Exception in thread "main" java.lang.NullPointerException
>>   at org.apache.spark.deploy.yarn.ApplicationMaster.waitForSparkContextInitialized(ApplicationMaster.scala:218)
>>   at org.apache.spark.deploy.yarn.ApplicationMaster.run(ApplicationMaster.scala:107)
>>   at org.apache.spark.deploy.yarn.ApplicationMaster$$anonfun$main$1.apply$mcV$sp(ApplicationMaster.scala:410)
>>   at org.apache.spark.deploy.SparkHadoopUtil$$anon$1.run(SparkHadoopUtil.scala:53)
>>   at org.apache.spark.deploy.SparkHadoopUtil$$anon$1.run(SparkHadoopUtil.scala:52)
>>   at java.security.AccessController.doPrivileged(Native Method)
>>   at javax.security.auth.Subject.doAs(Subject.java:415)
>>   at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1594)
>>   at org.apache.spark.deploy.SparkHadoopUtil.runAsSparkUser(SparkHadoopUtil.scala:52)
>>   at org.apache.spark.deploy.yarn.ApplicationMaster$.main(ApplicationMaster.scala:409)
>>   at org.apache.spark.deploy.yarn.ApplicationMaster.main(ApplicationMaster.scala)
>>
>> and later...
>>
>> Failed to list files for dir:
>> /data2/hadoop/yarn/local/usercache/user_name/appcache/application_1416332002106_0009/spark-local-20141121220325-b529/20
>>   at org.apache.spark.util.Utils$.listFilesSafely(Utils.scala:673)
>>   at org.apache.spark.util.Utils$.deleteRecursively(Utils.scala:685)
>>   at org.apache.spark.util.Utils$$anonfun$deleteRecursively$1.apply(Utils.scala:686)
>>   at org.apache.spark.util.Utils$$anonfun$deleteRecursively$1.apply(Utils.scala:685)
>>   at scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33)
>>   at scala.collection.mutable.WrappedArray.foreach(WrappedArray.scala:34)
>>   at org.apache.spark.util.Utils$.deleteRecursively(Utils.scala:685)
>>   at org.apache.spark.storage.DiskBlockManager$$anonfun$stop$1.apply(DiskBlockManager.scala:181)
>>   at org.apache.spark.storage.DiskBlockManager$$anonfun$stop$1.apply(DiskBlockManager.scala:178)
>>   at scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33)
>>   at scala.collection.mutable.ArrayOps$ofRef.foreach(ArrayOps.scala:108)
>>   at org.apache.spark.storage.DiskBlockManager.stop(DiskBlockManager.scala:178)
>>   at org.apache.spark.storage.DiskBlockManager$$anon$1$$anonfun$run$1.apply$mcV$sp(DiskBlockManager.scala:171)
>>   at org.apache.spark.storage.DiskBlockManager$$anon$1$$anonfun$run$1.apply(DiskBlockManager.scala:169)
>>   at org.apache.spark.storage.DiskBlockManager$$anon$1$$anonfun$run$1.apply(DiskBlockManager.scala:169)
>>   at org.apache.spark.util.Utils$.logUncaughtExceptions(Utils.scala:1311)
>>   at org.apache.spark.storage.DiskBlockManager$$anon$1.run(DiskBlockManager.scala:169)
>>
>> Note: I am building my jar locally, with the Spark dependency added in
>> pom.xml, and running it on a cluster that already runs Spark.
>>
>> -Pankaj
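
Why Akhil's point about cores matters: in Spark Streaming, every
receiver-based input DStream (socket, Flume, the default Kafka receiver,
etc.) runs as a long-lived task that permanently occupies one executor
core, so the total cores granted to the application must be strictly
greater than the number of receivers or no cores are left for batch
processing. A minimal sketch of a job with this shape, assuming a socket
source on localhost:9999 and a 10-second batch interval (both made up
for illustration, not taken from the thread):

    import org.apache.spark.SparkConf
    import org.apache.spark.streaming.{Seconds, StreamingContext}

    object StreamingSkeleton {
      def main(args: Array[String]): Unit = {
        // The master is not hard-coded; spark-submit supplies it
        // (yarn-cluster in the command quoted above).
        val conf = new SparkConf().setAppName("StreamingSkeleton")
        val ssc = new StreamingContext(conf, Seconds(10))

        // socketTextStream is receiver-based: the receiver runs as a
        // long-lived task and pins one executor core for the lifetime of
        // the application. With --num-executors 1 --executor-cores 1 that
        // is the only core the app owns, so batches queue up unprocessed.
        val lines = ssc.socketTextStream("localhost", 9999)
        lines.count().print()

        ssc.start()
        ssc.awaitTermination()
      }
    }

With a source like this, the submit command quoted above would need at
least --executor-cores 2 (or additional executors) so that one core can
receive while another processes.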
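
On the packaging note at the end of the thread: when the cluster already
provides Spark, the usual practice is to mark the Spark artifacts as
provided in pom.xml, so the application jar does not bundle a second,
possibly mismatched copy of Spark (a common source of odd runtime
failures on YARN). A sketch of the dependency entry; the artifact suffix
and version are guesses and must match the cluster's Scala and Spark
versions:

    <dependency>
      <groupId>org.apache.spark</groupId>
      <artifactId>spark-streaming_2.10</artifactId>
      <!-- assumed version: use whatever the cluster runs, e.g. 1.1.0 -->
      <version>1.1.0</version>
      <scope>provided</scope>
    </dependency>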