Hi there,
I want to achieve the following use case: start Zeppelin 0.9.0 (in Docker) on my local dev machine, but have the Spark jobs from the notebook run on a remote cluster via YARN.
For a few hours now I have been trying to set up that environment against my company's Cloudera CDH 6.3.1 development cluster. The cluster is unsecured (although it can only be reached while connected to the VPN). With a lot of trial and error I finally got a successful connection from my dockerized Zeppelin to the cluster: when I run a Spark cell in Zeppelin, I can see a new application (named spark-shared_process) appear in YARN on the cluster side. However, the execution of the cell eventually fails with the stack trace shown in the YARN application log [1]. I have no idea where this timeout could come from, and I'd be happy if you could help me out here. The VPN to the dev cluster has no connection restrictions such as firewalls in place. The cell I run is the first one in the "3. Spark SQL (Scala)" Zeppelin quick-start notebook, titled "Create Dataset/DataFrame via SparkSession".
For reference, I also attach my docker-compose file [2] and my Dockerfile for building Zeppelin with Spark and Hadoop [3]. (Note that I bake the Hadoop conf files into the image because I'd like to distribute it as ready-to-run to the other people in my project, without them having to copy the conf files over themselves.) After the container starts, I additionally change the interpreter settings: I set the master to yarn-cluster in the %spark interpreter settings and set zeppelin.interpreter.connect.timeout to 600000.
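Concretely, the two properties I change look roughly like this (property names as far as I remember the interpreter UI; the timeout should also be settable globally via zeppelin-site.xml, but I set it on the %spark interpreter):

master = yarn-cluster
zeppelin.interpreter.connect.timeout = 600000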
Best regards
Theo
PS: HDFS access in general seems to work fine. [4]
PPS: I also attach the docker container logs from one attempt. [5]
[1]
INFO [2021-04-01 23:48:20,984] ({main} Logging.scala[logInfo]:54) - Registered signal handler for TERM
INFO [2021-04-01 23:48:21,005] ({main} Logging.scala[logInfo]:54) - Registered signal handler for HUP
INFO [2021-04-01 23:48:21,014] ({main} Logging.scala[logInfo]:54) - Registered signal handler for INT
INFO [2021-04-01 23:48:22,158] ({main} Logging.scala[logInfo]:54) - Changing view acls to: yarn,sandbox
INFO [2021-04-01 23:48:22,160] ({main} Logging.scala[logInfo]:54) - Changing modify acls to: yarn,sandbox
INFO [2021-04-01 23:48:22,161] ({main} Logging.scala[logInfo]:54) - Changing view acls groups to:
INFO [2021-04-01 23:48:22,162] ({main} Logging.scala[logInfo]:54) - Changing modify acls groups to:
INFO [2021-04-01 23:48:22,168] ({main} Logging.scala[logInfo]:54) - SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(yarn, sandbox); groups with view permissions: Set(); users with modify permissions: Set(yarn, sandbox); groups with modify permissions: Set()
INFO [2021-04-01 23:48:25,388] ({main} Logging.scala[logInfo]:54) - Preparing Local resources
WARN [2021-04-01 23:48:28,111] ({main} NativeCodeLoader.java[<clinit>]:62) - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
INFO [2021-04-01 23:48:29,004] ({main} Logging.scala[logInfo]:54) - ApplicationAttemptId: appattempt_1617228950227_5781_000001
INFO [2021-04-01 23:48:29,041] ({main} Logging.scala[logInfo]:54) - Starting the user application in a separate Thread
INFO [2021-04-01 23:48:29,289] ({main} Logging.scala[logInfo]:54) - Waiting for spark context initialization...
INFO [2021-04-01 23:48:30,007] ({RegisterThread} RemoteInterpreterServer.java[run]:595) - Start registration
INFO [2021-04-01 23:48:30,009] ({RemoteInterpreterServer-Thread} RemoteInterpreterServer.java[run]:193) - Launching ThriftServer at 99.99.99.99:44802
INFO [2021-04-01 23:48:31,276] ({RegisterThread} RemoteInterpreterServer.java[run]:609) - Registering interpreter process
ERROR [2021-04-01 23:50:09,531] ({main} Logging.scala[logError]:91) - Uncaught exception:
java.util.concurrent.TimeoutException: Futures timed out after [100000 milliseconds]
at scala.concurrent.impl.Promise$DefaultPromise.ready(Promise.scala:223)
at scala.concurrent.impl.Promise$DefaultPromise.result(Promise.scala:227)
at org.apache.spark.util.ThreadUtils$.awaitResult(ThreadUtils.scala:220)
at org.apache.spark.deploy.yarn.ApplicationMaster.runDriver(ApplicationMaster.scala:469)
at org.apache.spark.deploy.yarn.ApplicationMaster.org$apache$spark$deploy$yarn$ApplicationMaster$$runImpl(ApplicationMaster.scala:305)
at org.apache.spark.deploy.yarn.ApplicationMaster$$anonfun$run$1.apply$mcV$sp(ApplicationMaster.scala:245)
at org.apache.spark.deploy.yarn.ApplicationMaster$$anonfun$run$1.apply(ApplicationMaster.scala:245)
at org.apache.spark.deploy.yarn.ApplicationMaster$$anonfun$run$1.apply(ApplicationMaster.scala:245)
at org.apache.spark.deploy.yarn.ApplicationMaster$$anon$3.run(ApplicationMaster.scala:780)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1698)
at org.apache.spark.deploy.yarn.ApplicationMaster.doAsUser(ApplicationMaster.scala:779)
at org.apache.spark.deploy.yarn.ApplicationMaster.run(ApplicationMaster.scala:244)
at org.apache.spark.deploy.yarn.ApplicationMaster$.main(ApplicationMaster.scala:804)
at org.apache.spark.deploy.yarn.ApplicationMaster.main(ApplicationMaster.scala)
INFO [2021-04-01 23:50:09,547] ({main} Logging.scala[logInfo]:54) - Final app status: FAILED, exitCode: 13, (reason: Uncaught exception: java.util.concurrent.TimeoutException: Futures timed out after [100000 milliseconds]
[2]
version: '3.7'
services:
  zeppelin:
    build: zeppelin-customized
    ports:
      - "9999:8080"
    environment:
      ZEPPELIN_PORT: 8080
      ZEPPELIN_JAVA_OPTS: >-
        -Dspark.driver.memory=1g
        -Dspark.executor.memory=2g
      HADOOP_USER_NAME: sandbox
    volumes:
      - zeppelindata:/zeppelin/data
      - zeppelinnotebooks:/zeppelin/notebook
volumes:
  zeppelindata:
  zeppelinnotebooks:
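(In case it matters, I bring the stack up roughly like this; exact commands from memory. The Zeppelin UI is then reachable on http://localhost:9999 via the 9999:8080 port mapping above.)

# build the customized image and start Zeppelin in the background
docker-compose up --build -d
# follow the container logs (attachment [5] was captured this way, more or less)
docker-compose logs -f zeppelin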
[3]
FROM apache/zeppelin:0.9.0

# default user is 1000 in zeppelin base..
USER root
RUN mkdir /spark && chown 1000:1000 /spark \
    && mkdir /hadoop && chown 1000:1000 /hadoop
USER 1000

# Add Spark
RUN cd /spark \
    && wget https://artfiles.org/apache.org/spark/spark-2.4.7/spark-2.4.7-bin-hadoop2.7.tgz \
    && tar xf spark-2.4.7-bin-hadoop2.7.tgz \
    && rm spark-2.4.7-bin-hadoop2.7.tgz \
    && cd ~
ENV SPARK_HOME /spark/spark-2.4.7-bin-hadoop2.7
ENV HADOOP_CONF_DIR /zeppelin/conf

# Add Hadoop
RUN cd /hadoop \
    && wget https://archive.apache.org/dist/hadoop/common/hadoop-3.0.0/hadoop-3.0.0.tar.gz \
    && tar xf hadoop-3.0.0.tar.gz \
    && rm hadoop-3.0.0.tar.gz \
    && cd ~
ENV HADOOP_HOME /hadoop/hadoop-3.0.0
ENV HADOOP_INSTALL=$HADOOP_HOME
ENV HADOOP_MAPRED_HOME=$HADOOP_HOME
ENV HADOOP_COMMON_HOME=$HADOOP_HOME
ENV HADOOP_HDFS_HOME=$HADOOP_HOME
ENV YARN_HOME=$HADOOP_HOME
ENV HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_HOME/lib/native
ENV HADOOP_OPTS="-Djava.library.path=$HADOOP_HOME/lib/nativ"
ENV PATH="${HADOOP_HOME}/bin:${HADOOP_HOME}/sbin:${PATH}"
ENV USE_HADOOP=true

# Copy over /etc/hadoop/conf from one of the cluster nodes...
COPY cloudernode/conf/ /zeppelin/conf/
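(The cloudernode/conf/ directory in the build context is simply a copy of the Hadoop/YARN client configuration from one of the cluster nodes, i.e. core-site.xml, hdfs-site.xml, yarn-site.xml and friends. Roughly how I fetched it; "someuser" is a placeholder and machine1.REMOVEDDOMAIN.de is the anonymized cluster host.)

# placeholders: "someuser" and machine1.REMOVEDDOMAIN.de stand in for the real account/host
mkdir -p zeppelin-customized/cloudernode
scp -r someuser@machine1.REMOVEDDOMAIN.de:/etc/hadoop/conf zeppelin-customized/cloudernode/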
[4]
%sh
hdfs dfs -ls /user/sandbox
=> prints out properly.
[5]
zeppelin_1 | WARN [2021-04-01 23:18:36,440] ({SchedulerFactory4}
SparkInterpreterLauncher.java[buildEnvFromProperties]:221) -
spark-defaults.conf doesn't exist:
/spark/spark-2.4.7-bin-hadoop2.7/conf/spark-defaults.conf
zeppelin_1 | INFO [2021-04-01 23:18:36,440] ({SchedulerFactory4}
SparkInterpreterLauncher.java[buildEnvFromProperties]:224) -
buildEnvFromProperties:
{PATH=/hadoop/hadoop-3.0.0/bin:/hadoop/hadoop-3.0.0/sbin:/opt/conda/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin,
ZEPPELIN_PORT=8080, HADOOP_CONF_DIR=/zeppelin/conf,
ZEPPELIN_JAVA_OPTS=-Dspark.driver.memory=1g -Dspark.executor.memory=2g,
ZEPPELIN_LOG_DIR=/opt/zeppelin/logs, MASTER=yarn,
ZEPPELIN_WAR=/opt/zeppelin/zeppelin-web-0.9.0.war, ZEPPELIN_ENCODING=UTF-8,
ZEPPELIN_SPARK_CONF= --conf
spark.yarn.dist.archives=/spark/spark-2.4.7-bin-hadoop2.7/R/lib/sparkr.zip#sparkr
--conf spark.yarn.isPython=true --conf spark.executor.instances=2 --conf
spark.app.name=spark-shared_process --conf spark.webui.yarn.useProxy=false
--conf spark.driver.cores=1 --conf spark.yarn.maxAppAttempts=1 --conf
spark.executor.memory=2g --conf spark.master=yarn-cluster --conf
spark.files=/opt/zeppelin/conf/log4j_yarn_cluster.properties --conf
spark.driver.memory=1g --conf
spark.jars=/opt/zeppelin/interpreter/spark/scala-2.11/spark-scala-2.11-0.9.0.jar,/opt/zeppelin/interpreter/zeppelin-interpreter-shaded-0.9.0.jar
--conf spark.executor.cores=1 --conf
spark.yarn.submit.waitAppCompletion=false,
JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64, JAVA_OPTS=
-Dspark.driver.memory=1g -Dspark.executor.memory=2g -Dfile.encoding=UTF-8
-Xms1024m -Xmx1024m
-Dlog4j.configuration=file:///opt/zeppelin/conf/log4j.properties
-Dzeppelin.log.file=/opt/zeppelin/logs/zeppelin--d5ea32f1f431.log,
INTERPRETER_GROUP_ID=spark-shared_process, Z_VERSION=0.9.0, LANG=en_US.UTF-8,
JAVA_INTP_OPTS= -Dfile.encoding=UTF-8
-Dlog4j.configuration=file:///opt/zeppelin/conf/log4j.properties
-Dlog4j.configurationFile=file:///opt/zeppelin/conf/log4j2.properties,
PYSPARK_PYTHON=python, HADOOP_USER_NAME=sandbox,
ZEPPELIN_SPARK_YARN_CLUSTER=true, Z_HOME=/opt/zeppelin,
SPARK_HOME=/spark/spark-2.4.7-bin-hadoop2.7,
ZEPPELIN_CONF_DIR=/opt/zeppelin/conf, YARN_HOME=/hadoop/hadoop-3.0.0,
HADOOP_HDFS_HOME=/hadoop/hadoop-3.0.0,
ZEPPELIN_RUNNER=/usr/lib/jvm/java-8-openjdk-amd64/bin/java,
HADOOP_MAPRED_HOME=/hadoop/hadoop-3.0.0, PWD=/opt/zeppelin,
HADOOP_COMMON_HOME=/hadoop/hadoop-3.0.0, HADOOP_INSTALL=/hadoop/hadoop-3.0.0,
ZEPPELIN_HOME=/opt/zeppelin, LOG_TAG=[ZEPPELIN_0.9.0]:,
ZEPPELIN_INTP_MEM=-Xms1024m -Xmx2048m,
HADOOP_OPTS=-Djava.library.path=/hadoop/hadoop-3.0.0/lib/nativ,
PYSPARK_DRIVER_PYTHON=python, ZEPPELIN_PID_DIR=/opt/zeppelin/run,
ZEPPELIN_ANGULAR_WAR=/opt/zeppelin/zeppelin-web-angular-0.9.0.war,
ZEPPELIN_MEM=-Xms1024m -Xmx1024m, HOSTNAME=d5ea32f1f431, LC_ALL=en_US.UTF-8,
ZEPPELIN_IDENT_STRING=, PYSPARK_PIN_THREAD=true,
HADOOP_HOME=/hadoop/hadoop-3.0.0, USE_HADOOP=true,
HADOOP_COMMON_LIB_NATIVE_DIR=/hadoop/hadoop-3.0.0/lib/native,
ZEPPELIN_ADDR=0.0.0.0, ZEPPELIN_INTERPRETER_REMOTE_RUNNER=bin/interpreter.sh,
SHLVL=0, HOME=/opt/zeppelin}
zeppelin_1 | INFO [2021-04-01 23:18:36,445] ({SchedulerFactory4}
ProcessLauncher.java[transition]:109) - Process state is transitioned to
LAUNCHED
zeppelin_1 | INFO [2021-04-01 23:18:36,446] ({SchedulerFactory4}
ProcessLauncher.java[launch]:96) - Process is launched:
[/opt/zeppelin/bin/interpreter.sh, -d, /opt/zeppelin/interpreter/spark, -c,
172.2.0.2, -p, 46781, -r, :, -i, spark-shared_process, -l,
/opt/zeppelin/local-repo/spark, -g, spark]
zeppelin_1 | WARN [2021-04-01 23:20:51,930] ({Exec Default Executor}
RemoteInterpreterManagedProcess.java[onProcessComplete]:255) - Process is
exited with exit value 0
zeppelin_1 | INFO [2021-04-01 23:20:51,933] ({Exec Default Executor}
ProcessLauncher.java[transition]:109) - Process state is transitioned to
COMPLETED
zeppelin_1 | INFO [2021-04-01 23:24:06,162] ({qtp418304857-11}
VFSNotebookRepo.java[save]:144) - Saving note 2EYUV26VR to Spark Tutorial/3.
Spark SQL (Scala)_2EYUV26VR.zpln
zeppelin_1 | INFO [2021-04-01 23:24:15,933] ({qtp418304857-27}
VFSNotebookRepo.java[save]:144) - Saving note 2EYUV26VR to Spark Tutorial/3.
Spark SQL (Scala)_2EYUV26VR.zpln
zeppelin_1 | WARN [2021-04-01 23:28:36,539] ({SchedulerFactory4}
NotebookServer.java[onStatusChange]:1928) - Job 20180530-101750_1491737301 is
finished, status: ERROR, exception: null, result: %text
org.apache.zeppelin.interpreter.InterpreterException: java.io.IOException: Fail
to launch interpreter process:
zeppelin_1 | Warning: Master yarn-cluster is deprecated since 2.0. Please use
master "yarn" with specified deploy mode instead.
zeppelin_1 | 21/04/01 23:18:44 WARN util.NativeCodeLoader: Unable to load
native-hadoop library for your platform... using builtin-java classes where
applicable
zeppelin_1 | 21/04/01 23:18:44 INFO client.RMProxy: Connecting to
ResourceManager at machine1.REMOVEDDOMAIN.de/99.99.99.99:8032
zeppelin_1 | 21/04/01 23:18:45 INFO yarn.Client: Requesting a new application
from cluster with 4 NodeManagers
zeppelin_1 | 21/04/01 23:18:45 INFO yarn.Client: Verifying our application has
not requested more than the maximum memory capability of the cluster (16400 MB
per container)
zeppelin_1 | 21/04/01 23:18:45 INFO yarn.Client: Will allocate AM container,
with 1408 MB memory including 384 MB overhead
zeppelin_1 | 21/04/01 23:18:45 INFO yarn.Client: Setting up container launch
context for our AM
zeppelin_1 | 21/04/01 23:18:45 INFO yarn.Client: Setting up the launch
environment for our AM container
zeppelin_1 | 21/04/01 23:18:45 INFO yarn.Client: Preparing resources for our AM
container
zeppelin_1 | 21/04/01 23:18:45 WARN yarn.Client: Neither spark.yarn.jars nor
spark.yarn.archive is set, falling back to uploading libraries under
SPARK_HOME.
zeppelin_1 | 21/04/01 23:18:53 INFO yarn.Client: Uploading resource file:/tmp/spark-4c2bf1a1-2e67-42a9-8524-7810e1448440/__spark_libs__5266504625643101044.zip -> hdfs://machine1.REMOVEDDOMAIN.de:8020/user/sandbox/.sparkStaging/application_1617315347811_0170/__spark_libs__5266504625643101044.zip
zeppelin_1 | 21/04/01 23:20:09 INFO yarn.Client: Uploading resource
file:/opt/zeppelin/interpreter/spark/spark-interpreter-0.9.0.jar ->
hdfs://machine1.REMOVEDDOMAIN.de:8020/user/sandbox/.sparkStaging/application_1617315347811_0170/spark-interpreter-0.9.0.jar
zeppelin_1 | 21/04/01 23:20:35 INFO yarn.Client: Uploading resource
file:/opt/zeppelin/interpreter/spark/scala-2.11/spark-scala-2.11-0.9.0.jar ->
hdfs://machine1.REMOVEDDOMAIN.de:8020/user/sandbox/.sparkStaging/application_1617315347811_0170/spark-scala-2.11-0.9.0.jar
zeppelin_1 | 21/04/01 23:20:35 INFO yarn.Client: Uploading resource
file:/opt/zeppelin/interpreter/zeppelin-interpreter-shaded-0.9.0.jar ->
hdfs://machine1.REMOVEDDOMAIN.de:8020/user/sandbox/.sparkStaging/application_1617315347811_0170/zeppelin-interpreter-shaded-0.9.0.jar
zeppelin_1 | 21/04/01 23:20:41 INFO yarn.Client: Uploading resource
file:/opt/zeppelin/conf/log4j_yarn_cluster.properties ->
hdfs://machine1.REMOVEDDOMAIN.de:8020/user/sandbox/.sparkStaging/application_1617315347811_0170/log4j_yarn_cluster.properties
zeppelin_1 | 21/04/01 23:20:42 INFO yarn.Client: Uploading resource
file:/spark/spark-2.4.7-bin-hadoop2.7/R/lib/sparkr.zip#sparkr ->
hdfs://machine1.REMOVEDDOMAIN.de:8020/user/sandbox/.sparkStaging/application_1617315347811_0170/sparkr.zip
zeppelin_1 | 21/04/01 23:20:43 INFO yarn.Client: Uploading resource
file:/spark/spark-2.4.7-bin-hadoop2.7/python/lib/pyspark.zip ->
hdfs://machine1.REMOVEDDOMAIN.de:8020/user/sandbox/.sparkStaging/application_1617315347811_0170/pyspark.zip
zeppelin_1 | 21/04/01 23:20:44 INFO yarn.Client: Uploading resource
file:/spark/spark-2.4.7-bin-hadoop2.7/python/lib/py4j-0.10.7-src.zip ->
hdfs://machine1.REMOVEDDOMAIN.de:8020/user/sandbox/.sparkStaging/application_1617315347811_0170/py4j-0.10.7-src.zip
zeppelin_1 | 21/04/01 23:20:45 INFO yarn.Client: Uploading resource file:/tmp/spark-4c2bf1a1-2e67-42a9-8524-7810e1448440/__spark_conf__8289533000141907930.zip -> hdfs://machine1.REMOVEDDOMAIN.de:8020/user/sandbox/.sparkStaging/application_1617315347811_0170/__spark_conf__.zip
zeppelin_1 | 21/04/01 23:20:46 INFO spark.SecurityManager: Changing view acls
to: zeppelin,sandbox
zeppelin_1 | 21/04/01 23:20:46 INFO spark.SecurityManager: Changing modify acls
to: zeppelin,sandbox
zeppelin_1 | 21/04/01 23:20:46 INFO spark.SecurityManager: Changing view acls
groups to:
zeppelin_1 | 21/04/01 23:20:46 INFO spark.SecurityManager: Changing modify acls
groups to:
zeppelin_1 | 21/04/01 23:20:46 INFO spark.SecurityManager: SecurityManager:
authentication disabled; ui acls disabled; users with view permissions:
Set(zeppelin, sandbox); groups with view permissions: Set(); users with modify
permissions: Set(zeppelin, sandbox); groups with modify permissions: Set()
zeppelin_1 | 21/04/01 23:20:51 INFO yarn.Client: Submitting application
application_1617315347811_0170 to ResourceManager
zeppelin_1 | 21/04/01 23:20:51 INFO impl.YarnClientImpl: Submitted application
application_1617315347811_0170
zeppelin_1 | 21/04/01 23:20:51 INFO yarn.Client: Application report for
application_1617315347811_0170 (state: ACCEPTED)
zeppelin_1 | 21/04/01 23:20:51 INFO yarn.Client:
zeppelin_1 | client token: N/A
zeppelin_1 | diagnostics: N/A
zeppelin_1 | ApplicationMaster host: N/A
zeppelin_1 | ApplicationMaster RPC port: -1
zeppelin_1 | queue: root.users.sandbox
zeppelin_1 | start time: 1617319251597
zeppelin_1 | final status: UNDEFINED
zeppelin_1 | tracking URL:
http://machine1.REMOVEDDOMAIN.de:8088/proxy/application_1617315347811_0170/
zeppelin_1 | user: sandbox
zeppelin_1 | 21/04/01 23:20:51 INFO util.ShutdownHookManager: Shutdown hook
called
zeppelin_1 | 21/04/01 23:20:51 INFO util.ShutdownHookManager: Deleting
directory /tmp/spark-1d86bc2c-eade-48f5-9650-423eef0fbda2
zeppelin_1 | 21/04/01 23:20:51 INFO util.ShutdownHookManager: Deleting
directory /tmp/spark-4c2bf1a1-2e67-42a9-8524-7810e1448440
zeppelin_1 |
zeppelin_1 | at org.apache.zeppelin.interpreter.remote.RemoteInterpreter.open(RemoteInterpreter.java:129)
zeppelin_1 | at org.apache.zeppelin.interpreter.remote.RemoteInterpreter.getFormType(RemoteInterpreter.java:271)
zeppelin_1 | at org.apache.zeppelin.notebook.Paragraph.jobRun(Paragraph.java:444)
zeppelin_1 | at org.apache.zeppelin.notebook.Paragraph.jobRun(Paragraph.java:72)
zeppelin_1 | at org.apache.zeppelin.scheduler.Job.run(Job.java:172)
zeppelin_1 | at org.apache.zeppelin.scheduler.AbstractScheduler.runJob(AbstractScheduler.java:132)
zeppelin_1 | at org.apache.zeppelin.scheduler.RemoteScheduler$JobRunner.run(RemoteScheduler.java:182)
zeppelin_1 | at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
zeppelin_1 | at java.util.concurrent.FutureTask.run(FutureTask.java:266)
zeppelin_1 | at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
zeppelin_1 | at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
zeppelin_1 | at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
zeppelin_1 | at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
zeppelin_1 | at java.lang.Thread.run(Thread.java:748)
zeppelin_1 | Caused by: java.io.IOException: Fail to launch interpreter
process:
zeppelin_1 | Warning: Master yarn-cluster is deprecated since 2.0. Please use
master "yarn" with specified deploy mode instead.
zeppelin_1 | 21/04/01 23:18:44 WARN util.NativeCodeLoader: Unable to load
native-hadoop library for your platform... using builtin-java classes where
applicable
zeppelin_1 | 21/04/01 23:18:44 INFO client.RMProxy: Connecting to
ResourceManager at machine1.REMOVEDDOMAIN.de/99.99.99.99:8032
zeppelin_1 | 21/04/01 23:18:45 INFO yarn.Client: Requesting a new application
from cluster with 4 NodeManagers
zeppelin_1 | 21/04/01 23:18:45 INFO yarn.Client: Verifying our application has
not requested more than the maximum memory capability of the cluster (16400 MB
per container)
zeppelin_1 | 21/04/01 23:18:45 INFO yarn.Client: Will allocate AM container,
with 1408 MB memory including 384 MB overhead
zeppelin_1 | 21/04/01 23:18:45 INFO yarn.Client: Setting up container launch
context for our AM
zeppelin_1 | 21/04/01 23:18:45 INFO yarn.Client: Setting up the launch
environment for our AM container
zeppelin_1 | 21/04/01 23:18:45 INFO yarn.Client: Preparing resources for our AM
container
zeppelin_1 | 21/04/01 23:18:45 WARN yarn.Client: Neither spark.yarn.jars nor
spark.yarn.archive is set, falling back to uploading libraries under
SPARK_HOME.
zeppelin_1 | 21/04/01 23:18:53 INFO yarn.Client: Uploading resource file:/tmp/spark-4c2bf1a1-2e67-42a9-8524-7810e1448440/__spark_libs__5266504625643101044.zip -> hdfs://machine1.REMOVEDDOMAIN.de:8020/user/sandbox/.sparkStaging/application_1617315347811_0170/__spark_libs__5266504625643101044.zip
zeppelin_1 | 21/04/01 23:20:09 INFO yarn.Client: Uploading resource
file:/opt/zeppelin/interpreter/spark/spark-interpreter-0.9.0.jar ->
hdfs://machine1.REMOVEDDOMAIN.de:8020/user/sandbox/.sparkStaging/application_1617315347811_0170/spark-interpreter-0.9.0.jar
zeppelin_1 | 21/04/01 23:20:35 INFO yarn.Client: Uploading resource
file:/opt/zeppelin/interpreter/spark/scala-2.11/spark-scala-2.11-0.9.0.jar ->
hdfs://machine1.REMOVEDDOMAIN.de:8020/user/sandbox/.sparkStaging/application_1617315347811_0170/spark-scala-2.11-0.9.0.jar
zeppelin_1 | 21/04/01 23:20:35 INFO yarn.Client: Uploading resource
file:/opt/zeppelin/interpreter/zeppelin-interpreter-shaded-0.9.0.jar ->
hdfs://machine1.REMOVEDDOMAIN.de:8020/user/sandbox/.sparkStaging/application_1617315347811_0170/zeppelin-interpreter-shaded-0.9.0.jar
zeppelin_1 | 21/04/01 23:20:41 INFO yarn.Client: Uploading resource
file:/opt/zeppelin/conf/log4j_yarn_cluster.properties ->
hdfs://machine1.REMOVEDDOMAIN.de:8020/user/sandbox/.sparkStaging/application_1617315347811_0170/log4j_yarn_cluster.properties
zeppelin_1 | 21/04/01 23:20:42 INFO yarn.Client: Uploading resource
file:/spark/spark-2.4.7-bin-hadoop2.7/R/lib/sparkr.zip#sparkr ->
hdfs://machine1.REMOVEDDOMAIN.de:8020/user/sandbox/.sparkStaging/application_1617315347811_0170/sparkr.zip
zeppelin_1 | 21/04/01 23:20:43 INFO yarn.Client: Uploading resource
file:/spark/spark-2.4.7-bin-hadoop2.7/python/lib/pyspark.zip ->
hdfs://machine1.REMOVEDDOMAIN.de:8020/user/sandbox/.sparkStaging/application_1617315347811_0170/pyspark.zip
zeppelin_1 | 21/04/01 23:20:44 INFO yarn.Client: Uploading resource
file:/spark/spark-2.4.7-bin-hadoop2.7/python/lib/py4j-0.10.7-src.zip ->
hdfs://machine1.REMOVEDDOMAIN.de:8020/user/sandbox/.sparkStaging/application_1617315347811_0170/py4j-0.10.7-src.zip
zeppelin_1 | 21/04/01 23:20:45 INFO yarn.Client: Uploading resource file:/tmp/spark-4c2bf1a1-2e67-42a9-8524-7810e1448440/__spark_conf__8289533000141907930.zip -> hdfs://machine1.REMOVEDDOMAIN.de:8020/user/sandbox/.sparkStaging/application_1617315347811_0170/__spark_conf__.zip
zeppelin_1 | 21/04/01 23:20:46 INFO spark.SecurityManager: Changing view acls
to: zeppelin,sandbox
zeppelin_1 | 21/04/01 23:20:46 INFO spark.SecurityManager: Changing modify acls
to: zeppelin,sandbox
zeppelin_1 | 21/04/01 23:20:46 INFO spark.SecurityManager: Changing view acls
groups to:
zeppelin_1 | 21/04/01 23:20:46 INFO spark.SecurityManager: Changing modify acls
groups to:
zeppelin_1 | 21/04/01 23:20:46 INFO spark.SecurityManager: SecurityManager:
authentication disabled; ui acls disabled; users with view permissions:
Set(zeppelin, sandbox); groups with view permissions: Set(); users with modify
permissions: Set(zeppelin, sandbox); groups with modify permissions: Set()
zeppelin_1 | 21/04/01 23:20:51 INFO yarn.Client: Submitting application
application_1617315347811_0170 to ResourceManager
zeppelin_1 | 21/04/01 23:20:51 INFO impl.YarnClientImpl: Submitted application
application_1617315347811_0170
zeppelin_1 | 21/04/01 23:20:51 INFO yarn.Client: Application report for
application_1617315347811_0170 (state: ACCEPTED)
zeppelin_1 | 21/04/01 23:20:51 INFO yarn.Client:
zeppelin_1 | client token: N/A
zeppelin_1 | diagnostics: N/A
zeppelin_1 | ApplicationMaster host: N/A
zeppelin_1 | ApplicationMaster RPC port: -1
zeppelin_1 | queue: root.users.sandbox
zeppelin_1 | start time: 1617319251597
zeppelin_1 | final status: UNDEFINED
zeppelin_1 | tracking URL:
http://machine1.REMOVEDDOMAIN.de:8088/proxy/application_1617315347811_0170/
zeppelin_1 | user: sandbox
zeppelin_1 | 21/04/01 23:20:51 INFO util.ShutdownHookManager: Shutdown hook
called
zeppelin_1 | 21/04/01 23:20:51 INFO util.ShutdownHookManager: Deleting
directory /tmp/spark-4c2bf1a1-2e67-42a9-8524-7810e1448440
zeppelin_1 | 21/04/01 23:20:51 INFO util.ShutdownHookManager: Deleting
directory /tmp/spark-1d86bc2c-eade-48f5-9650-423eef0fbda2
zeppelin_1 |
zeppelin_1 | at org.apache.zeppelin.interpreter.remote.RemoteInterpreterManagedProcess.start(RemoteInterpreterManagedProcess.java:126)
zeppelin_1 | at org.apache.zeppelin.interpreter.ManagedInterpreterGroup.getOrCreateInterpreterProcess(ManagedInterpreterGroup.java:68)
zeppelin_1 | at org.apache.zeppelin.interpreter.remote.RemoteInterpreter.getOrCreateInterpreterProcess(RemoteInterpreter.java:104)
zeppelin_1 | at org.apache.zeppelin.interpreter.remote.RemoteInterpreter.internal_create(RemoteInterpreter.java:154)
zeppelin_1 | at org.apache.zeppelin.interpreter.remote.RemoteInterpreter.open(RemoteInterpreter.java:126)
zeppelin_1 | ... 13 more
zeppelin_1 |
zeppelin_1 | INFO [2021-04-01 23:28:36,542] ({SchedulerFactory4}
VFSNotebookRepo.java[save]:144) - Saving note 2EYUV26VR to Spark Tutorial/3.
Spark SQL (Scala)_2EYUV26VR.zpln