Hi all,
I am facing a strange issue on two different machines that act as servers.
Each of them runs an instance of Zeppelin installed as a systemd service.
The configuration is:
 - Ubuntu Server 16.04.2 LTS
 - Spark 2.1.0
 - Microsoft Open R 3.3.2
 - Zeppelin 0.7.1 (0.7.0 gave the same problems)

zeppelin-env.sh has the following settings:
export SPARK_HOME="/spark/home/directory"

spark-env.sh has the following settings:
export LANG="en_US"
export SPARK_DAEMON_JAVA_OPTS+=" -Dspark.local.dir=/some/dir -Dspark.eventLog.dir=/some/dir/spark-events -Dhadoop.tmp.dir=/some/dir"
export _JAVA_OPTIONS+=" -Djava.io.tmpdir=/some/dir"

spark-defaults.conf is set as:
spark.executor.memory                   21g
spark.driver.memory                     21g
spark.python.worker.memory              4g
spark.sql.autoBroadcastJoinThreshold    0

I use Spark in stand-alone mode and it works perfectly. It also works correctly
with Zeppelin at first, but this is what happens:
1) Start Zeppelin on the server using the command: service zeppelin start
2) Connect to port 8080 using Mozilla Firefox from a client
3) Enter username and password (I enabled Shiro authentication)
4) Open a notebook
5) Execute the following code:
%spark.r
2+2
6) The code runs correctly and I can see that R is running as a process.
7) Repeat steps 2-5 after some time (say 2 or 3 hours) and Zeppelin stays on
“Running” forever or, if more time has passed since the last run (for example
1 day), it returns “Error”. The time it takes to become unresponsive seems
random and unpredictable. At that point R is no longer in the list of running
processes, but the Spark session is still active: I can access the Spark UI on
port 4040 and the application name is “Zeppelin”, so it is the Spark instance
created by Zeppelin.
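
The state described in step 7 (Spark UI still reachable on port 4040, no R
process) could be checked from the server with something like the Python
sketch below. It is only an illustration: the function names and the status
labels are made up for this example, and it assumes the interpreter's R
process is literally named "R" and that pgrep is available. Nothing here is
part of Zeppelin itself.

```python
import socket
import subprocess


def port_is_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and unreachable hosts.
        return False


def process_is_running(name: str = "R") -> bool:
    """Return True if pgrep finds a process whose name is exactly `name`.

    Assumption: the R process started by the interpreter is named "R".
    """
    try:
        result = subprocess.run(
            ["pgrep", "-x", name],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
    except FileNotFoundError:
        # pgrep is not installed, so nothing can be reported as running.
        return False
    return result.returncode == 0


def classify(spark_ui_up: bool, r_running: bool) -> str:
    """Map the two observations to a short, hypothetical status label."""
    if spark_ui_up and r_running:
        return "healthy"
    if spark_ui_up and not r_running:
        # The situation from step 7: Spark is alive, R has disappeared.
        return "r-process-gone"
    if r_running:
        return "spark-ui-down"
    return "both-down"
```

Calling classify(port_is_open("localhost", 4040), process_is_running()) from a
cron job and appending the result to a file with a timestamp would at least
show when the R process disappears relative to the last successful run.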

I observed that sometimes I can simply restart the interpreter from the
Zeppelin UI, but many other times that doesn’t work and I have to restart
Zeppelin (service zeppelin restart).

This issue affects both 0.7.0 and 0.7.1, but I haven’t tried earlier
versions. It also happens if Zeppelin isn’t installed as a service.

I can’t provide more detail because I can’t see any error or warning in the
logs, which is really strange.

Thank you all.
Kind regards
 Pietro Pugni
