Hi there,

I am trying to run the example code pi.py on a cluster, but so far I have only got it working on localhost. When I try to run it in standalone mode with

./bin/spark-submit \
  --master spark://[mymaster]:7077 \
  examples/src/main/python/pi.py

I get warnings about resources and memory (the workstation actually has 192 GB of memory and 32 cores):

14/11/01 21:37:05 WARN scheduler.TaskSchedulerImpl: Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient memory
14/11/01 21:37:05 INFO client.AppClient$ClientActor: Executor updated: app-20141101213420-0000/4 is now EXITED (Command exited with code 1)
14/11/01 21:37:05 INFO cluster.SparkDeploySchedulerBackend: Executor app-20141101213420-0000/4 removed: Command exited with code 1
14/11/01 21:37:05 INFO client.AppClient$ClientActor: Executor added: app-20141101213420-0000/5 on worker-20141101213345-localhost-33525 (localhost:33525) with 32 cores
14/11/01 21:37:05 INFO cluster.SparkDeploySchedulerBackend: Granted executor ID app-20141101213420-0000/5 on hostPort localhost:33525 with 32 cores, 1024.0 MB RAM
14/11/01 21:37:05 INFO client.AppClient$ClientActor: Executor updated: app-20141101213420-0000/5 is now RUNNING
14/11/01 21:37:20 WARN scheduler.TaskSchedulerImpl: Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient memory
14/11/01 21:37:35 WARN scheduler.TaskSchedulerImpl: Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient memory
14/11/01 21:37:38 INFO client.AppClient$ClientActor: Executor updated: app-20141101213420-0000/5 is now EXITED (Command exited with code 1)
14/11/01 21:37:38 INFO cluster.SparkDeploySchedulerBackend: Executor app-20141101213420-0000/5 removed: Command exited with code 1
14/11/01 21:37:38 INFO client.AppClient$ClientActor: Executor added: app-20141101213420-0000/6 on worker-20141101213345-localhost-33525 (localhost:33525) with 32 cores
14/11/01 21:37:38 INFO cluster.SparkDeploySchedulerBackend: Granted executor ID app-20141101213420-0000/6 on hostPort localhost:33525 with 32 cores, 1024.0 MB RAM
14/11/01 21:37:38 INFO client.AppClient$ClientActor: Executor updated: app-20141101213420-0000/6 is now RUNNING
14/11/01 21:37:50 WARN scheduler.TaskSchedulerImpl: Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient memory
14/11/01 21:38:05 WARN scheduler.TaskSchedulerImpl: Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient memory
14/11/01 21:38:11 INFO client.AppClient$ClientActor: Executor updated: app-20141101213420-0000/6 is now EXITED (Command exited with code 1)
14/11/01 21:38:11 INFO cluster.SparkDeploySchedulerBackend: Executor app-20141101213420-0000/6 removed: Command exited with code 1
14/11/01 21:38:11 INFO client.AppClient$ClientActor: Executor added: app-20141101213420-0000/7 on worker-20141101213345-localhost-33525 (localhost:33525) with 32 cores
14/11/01 21:38:11 INFO cluster.SparkDeploySchedulerBackend: Granted executor ID app-20141101213420-0000/7 on hostPort localhost:33525 with 32 cores, 1024.0 MB RAM
14/11/01 21:38:11 INFO client.AppClient$ClientActor: Executor updated: app-20141101213420-0000/7 is now RUNNING
[..]

The worker connects to the master successfully and tries to run the code:

14/11/01 21:39:17 INFO worker.Worker: Asked to launch executor app-20141101213420-0000/9 for PythonPi
14/11/01 21:39:17 WARN worker.CommandUtils: SPARK_JAVA_OPTS was set on the worker. It is deprecated in Spark 1.0.
14/11/01 21:39:17 WARN worker.CommandUtils: Set SPARK_LOCAL_DIRS for node-specific storage locations.
14/11/01 21:39:17 INFO worker.ExecutorRunner: Launch command: "/usr/lib/jvm/java-6-openjdk-amd64/jre/bin/java" "-cp" "::/etc/hadoop/spark/spark-1.1.0/conf:/etc/hadoop/spark/spark-1.1.0/assembly/target/scala-2.10/spark-assembly-1.1.0-hadoop2.5.1.jar:/etc/hadoop/conf" "-XX:MaxPermSize=128m" "-verbose:gc" "-XX:+PrintGCDetails" "-XX:+PrintGCTimeStamps" "-Dspark.akka.frameSize=32" "-Dspark.driver.port=47509" "-verbose:gc" "-XX:+PrintGCDetails" "-XX:+PrintGCTimeStamps" "-Xms1024M" "-Xmx1024M" "org.apache.spark.executor.CoarseGrainedExecutorBackend" "akka.tcp://sparkDriver@localhost:47509/user/CoarseGrainedScheduler" "9" "localhost" "32" "akka.tcp://sparkWorker@localhost:33525/user/Worker" "app-20141101213420-0000"
14/11/01 21:39:50 INFO worker.Worker: Executor app-20141101213420-0000/9 finished with state EXITED message Command exited with code 1 exitStatus 1

The executor's stderr log in /spark-1.1.0/work/app-20141101213420-0000/[..]/stderr shows:

14/11/01 21:38:46 INFO Remoting: Remoting started; listening on addresses :[akka.tcp://driverPropsFetcher@localhost:52163]
14/11/01 21:38:46 INFO Remoting: Remoting now listens on addresses: [akka.tcp://driverPropsFetcher@localhost:52163]
14/11/01 21:38:46 INFO util.Utils: Successfully started service 'driverPropsFetcher' on port 52163.
Exception in thread "main" java.lang.reflect.UndeclaredThrowableException
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1629)
    at org.apache.spark.deploy.SparkHadoopUtil.runAsSparkUser(SparkHadoopUtil.scala:52)
    at org.apache.spark.executor.CoarseGrainedExecutorBackend$.run(CoarseGrainedExecutorBackend.scala:113)
    at org.apache.spark.executor.CoarseGrainedExecutorBackend$.main(CoarseGrainedExecutorBackend.scala:156)
    at org.apache.spark.executor.CoarseGrainedExecutorBackend.main(CoarseGrainedExecutorBackend.scala)
Caused by: java.util.concurrent.TimeoutException: Futures timed out after [30 seconds]
    at scala.concurrent.impl.Promise$DefaultPromise.ready(Promise.scala:219)
    at scala.concurrent.impl.Promise$DefaultPromise.result(Promise.scala:223)
    at scala.concurrent.Await$$anonfun$result$1.apply(package.scala:107)
    at scala.concurrent.BlockContext$DefaultBlockContext$.blockOn(BlockContext.scala:53)
    at scala.concurrent.Await$.result(package.scala:107)
    at org.apache.spark.executor.CoarseGrainedExecutorBackend$$anonfun$run$1.apply$mcV$sp(CoarseGrainedExecutorBackend.scala:125)
    at org.apache.spark.deploy.SparkHadoopUtil$$anon$1.run(SparkHadoopUtil.scala:53)
    at org.apache.spark.deploy.SparkHadoopUtil$$anon$1.run(SparkHadoopUtil.scala:52)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:416)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1614)
    ... 4 more

Does anybody have an idea how to resolve this? I am puzzled. The web UIs are up and running and look reasonable.

Best,
Tassilo
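
P.S. One check that might help narrow this down: the executor's stderr shows the 30-second timeout right after driverPropsFetcher starts, and the launch command hands the executor the driver address akka.tcp://sparkDriver@localhost:47509. Below is a minimal sketch, assuming it is run on the worker host while the driver is still up, that tests whether that address accepts TCP connections at all. The helper names parse_driver_url and can_connect are hypothetical and not part of Spark; a successful connection would not prove the Akka handshake works, only that the port is reachable.

```python
import re
import socket


def parse_driver_url(url):
    """Extract (host, port) from an akka.tcp URL as printed in the launch command.

    Hypothetical helper for illustration; not a Spark API.
    """
    match = re.match(r"akka\.tcp://[^@]+@([^:/]+):(\d+)", url)
    if match is None:
        raise ValueError("not an akka.tcp URL: %r" % url)
    return match.group(1), int(match.group(2))


def can_connect(host, port, timeout=5.0):
    """Return True if a plain TCP connection to host:port succeeds within timeout."""
    try:
        conn = socket.create_connection((host, port), timeout=timeout)
    except socket.error:
        # Covers refused connections, timeouts and name-resolution failures.
        return False
    conn.close()
    return True


if __name__ == "__main__":
    # Driver URL copied from the worker's "Launch command" log line above.
    url = "akka.tcp://sparkDriver@localhost:47509/user/CoarseGrainedScheduler"
    host, port = parse_driver_url(url)
    print("driver %s:%d reachable: %s" % (host, port, can_connect(host, port)))
```

The driver port changes with every submitted application, so the URL would need to be taken from the launch command of the run being checked.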