Hi All,

We have a requirement to support Zeppelin deployment on a Kubernetes cluster and to run Spark jobs over Zeppelin through the Spark interpreter.
* As a first step, we cloned the Zeppelin source code at the last commit id 685eb9249d1c1d821ce57f1ed0559f1539dfbe69.
* While building the source code at that commit we faced the following issues:
  1. After cloning the source to a particular directory, some modifications are needed; otherwise the build fails with errors about 'npm' not being installed properly. Also, the directory into which the Zeppelin code is cloned must be at least three levels below the root folder of the machine, because the build creates the plugin folder three levels up from the cloned folder. (Refer to the pom.xml of zeppelin-distribution in the Zeppelin source code.)
  2. In our environment, proxy URLs had to be added to the .bowerrc file of the zeppelin-web project (see the sample sketched after this list). This could be documented in the build user guide.
  3. Not all profiles are correctly supported by the Zeppelin code; some specific profiles (-Pyarn -Ppyspark) need to be added at build time.
  4. The following Maven command is then executed from the Zeppelin parent directory to build the code:
       mvn clean install -DskipTests -Drat.skip=true -Pspark-2.2 -Phadoop2 -Pscala-2.11 -s settings.xml
  5. With the above command, the build was successful for all 49 submodule packages of Zeppelin.
  6. Then cd into zeppelin-distribution and execute the following Maven goal:
       mvn org.apache.maven.plugins:maven-assembly-plugin:3.0.0:single -P apache-release
* After building the source code, we applied the PR https://github.com/apache/zeppelin/pull/2637 on top of it and rebuilt.
* When Zeppelin was deployed to the Kubernetes cluster and a Spark job was submitted, the driver and executors came up, but they were going through the interpreter launcher class SparkInterpreterLauncher.java and not SparkK8SInterpreterLauncher.java from the above-mentioned PR.
* After reviewing the PR, we found a condition in InterpreterSetting.java that launches SparkK8SInterpreterLauncher only if the deploy mode is "cluster"; otherwise it launches SparkInterpreterLauncher.
* Setting the deploy mode from the Zeppelin GUI did not work.
* So we modified the interpreter-setting.json file under zeppelin-0.9.0-SNAPSHOT/interpreter/spark to set spark.submit.deployMode to "cluster" (a sketch of the entry we added is given below).
* After these changes, the spawned driver fails with ClassNotFoundException: org.apache.zeppelin.interpreter.remote.RemoteInterpreterServer, because spark-interpreter-0.9.0-SNAPSHOT.jar is used as the application jar during the spark-submit job.
* To resolve this, interpreter.sh was modified to run the spark-submit command with zeppelin-interpreter-0.9.0-SNAPSHOT.jar instead.
* When that jar was used for the spark-submit command, one more exception was encountered: a ClassNotFoundException for cerner.ether (class not found).
* To resolve that error, spark-interpreter-0.9.0-SNAPSHOT.jar needs to be added to the classpath using the --jars option during spark-submit (a sketch of the resulting command is given below).
* After integrating all of the above changes, Zeppelin's Spark job submission started using SparkK8SInterpreterLauncher.java.

Our proposal: after all of the above modifications, should these changes go upstream for merging into the Zeppelin source code? Please also let us know whether the procedure we followed is the proper way. One more question: when is the above-mentioned PR planned to be merged into the source code?
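For reference, here is a minimal sketch of the .bowerrc change from issue 2 above, written as a shell snippet. The proxy host and port are placeholders for the values from our environment; "proxy" and "https-proxy" are the standard bower configuration keys:

    # Sketch only: replace proxy.example.com:3128 with your proxy host/port.
    cat > zeppelin-web/.bowerrc <<'EOF'
    {
      "directory": "bower_components",
      "proxy": "http://proxy.example.com:3128",
      "https-proxy": "http://proxy.example.com:3128"
    }
    EOF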
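This is a sketch of the kind of property entry we added under the spark interpreter's "properties" section of interpreter-setting.json. The field names mirror the existing entries in that file; the exact schema may differ across Zeppelin versions, so treat this as illustrative only:

    "spark.submit.deployMode": {
      "envName": null,
      "propertyName": "spark.submit.deployMode",
      "defaultValue": "cluster",
      "description": "Deploy mode of the Spark driver (client or cluster)"
    }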
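And a minimal sketch of what the spark-submit invocation looks like after the interpreter.sh changes described above. The two jar names and the RemoteInterpreterServer class come from the steps above; SPARK_HOME, ZEPPELIN_HOME, the jar locations, the Kubernetes master URL, and the trailing arguments are placeholders that interpreter.sh fills in:

    # Sketch only: paths, master URL, and trailing args are placeholders.
    ${SPARK_HOME}/bin/spark-submit \
      --master k8s://https://<kubernetes-api-server>:<port> \
      --deploy-mode cluster \
      --class org.apache.zeppelin.interpreter.remote.RemoteInterpreterServer \
      --jars ${ZEPPELIN_HOME}/interpreter/spark/spark-interpreter-0.9.0-SNAPSHOT.jar \
      ${ZEPPELIN_HOME}/lib/zeppelin-interpreter-0.9.0-SNAPSHOT.jar \
      <callback-host> <callback-port>   # args supplied by interpreter.sh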
Some changes may still be needed for the K8_URL handling. We are happy to work with the community developers so that these changes can be integrated into the Zeppelin source code.

Regards,
Naveen