heyang wang created ZEPPELIN-2558:
-------------------------------------

             Summary: Livy configuration mismatch
                 Key: ZEPPELIN-2558
                 URL: https://issues.apache.org/jira/browse/ZEPPELIN-2558
             Project: Zeppelin
          Issue Type: Bug
          Components: livy-interpreter
    Affects Versions: 0.7.1
            Reporter: heyang wang


I am using Zeppelin 0.7.1 with livy-0.4-snapshot. When I edit the Spark 
resource settings of the Livy interpreter in the Zeppelin web UI, I get the 
following error from the YARN application master.

Exception in thread "main" java.lang.OutOfMemoryError: GC overhead limit exceeded
        at org.apache.xerces.dom.DeferredDocumentImpl.getNodeObject(Unknown Source)
        at org.apache.xerces.dom.DeferredDocumentImpl.synchronizeChildren(Unknown Source)
        at org.apache.xerces.dom.DeferredElementNSImpl.synchronizeChildren(Unknown Source)
        at org.apache.xerces.dom.ParentNode.hasChildNodes(Unknown Source)
        at org.apache.hadoop.conf.Configuration.loadResource(Configuration.java:2551)
        at org.apache.hadoop.conf.Configuration.loadResources(Configuration.java:2444)
        at org.apache.hadoop.conf.Configuration.getProps(Configuration.java:2361)
        at org.apache.hadoop.conf.Configuration.get(Configuration.java:968)
        at org.apache.hadoop.conf.Configuration.getTrimmed(Configuration.java:987)
        at org.apache.hadoop.conf.Configuration.getBoolean(Configuration.java:1388)
        at org.apache.hadoop.security.SecurityUtil.<clinit>(SecurityUtil.java:70)
        at org.apache.hadoop.security.UserGroupInformation.initialize(UserGroupInformation.java:272)
        at org.apache.hadoop.security.UserGroupInformation.setConfiguration(UserGroupInformation.java:311)
        at org.apache.spark.deploy.SparkHadoopUtil.<init>(SparkHadoopUtil.scala:55)
        at org.apache.spark.deploy.yarn.YarnSparkHadoopUtil.<init>(YarnSparkHadoopUtil.scala:56)
        at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
        at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
        at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
        at java.lang.reflect.Constructor.newInstance(Constructor.java:422)
        at java.lang.Class.newInstance(Class.java:442)
        at org.apache.spark.deploy.SparkHadoopUtil$.liftedTree1$1(SparkHadoopUtil.scala:414)
        at org.apache.spark.deploy.SparkHadoopUtil$.yarn$lzycompute(SparkHadoopUtil.scala:412)
        at org.apache.spark.deploy.SparkHadoopUtil$.yarn(SparkHadoopUtil.scala:412)
        at org.apache.spark.deploy.SparkHadoopUtil$.get(SparkHadoopUtil.scala:437)
        at org.apache.spark.deploy.yarn.ApplicationMaster$.main(ApplicationMaster.scala:747)
        at org.apache.spark.deploy.yarn.ApplicationMaster.main(ApplicationMaster.scala)

It turned out that the above error is caused by a mismatch between the 
Zeppelin Livy interpreter configuration and the Livy server configuration.
In the Zeppelin logs, I can see that Zeppelin posts the following JSON to the 
Livy server:

DEBUG [2017-05-17 11:29:39,821] ({pool-2-thread-9} HttpAccessor.java[createRequest]:79) - Created POST request for "http://10.204.11.182:8998/sessions"
DEBUG [2017-05-17 11:29:39,821] ({pool-2-thread-9} RestTemplate.java[doWithRequest]:746) - Setting request Accept header to [text/plain, application/json, application/*+json, */*]
DEBUG [2017-05-17 11:29:39,821] ({pool-2-thread-9} RestTemplate.java[doWithRequest]:841) - Writing [{
  "kind": "pyspark",
  "proxyUser": "heyang.w...@ucarinc.com",
  "conf": {
    "spark.executor.memory": "2",
    "spark.driver.memory": "4",
    "spark.driver.cores": "1",
    "spark.executor.cores": "1",
    "spark.executor.instances": "10"
  }

However, according to https://github.com/cloudera/livy, the Livy server 
accepts session parameters like the following:

name            description                                     type
driverMemory    Amount of memory to use for the driver process  string
driverCores     Number of cores to use for the driver process   int
executorMemory  Amount of memory to use per executor process    string
executorCores   Number of cores to use for each executor        int
numExecutors    Number of executors to launch for this session  int
archives        Archives to be used in this session             List of string
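
To make the difference concrete, here is a minimal Python sketch of how the 
request from the log could be rewritten to use the Livy fields listed above. 
The mapping table and the function name to_livy_session_request are my own 
illustration (assumptions), not code from Zeppelin or Livy, and the proxyUser 
field is left out of the example. Values are passed through unchanged.

```python
import json

# Hypothetical mapping (an assumption for illustration): Spark property name
# as sent by Zeppelin -> (Livy session field, type from the Livy README).
SPARK_CONF_TO_LIVY_FIELD = {
    "spark.driver.memory": ("driverMemory", str),
    "spark.driver.cores": ("driverCores", int),
    "spark.executor.memory": ("executorMemory", str),
    "spark.executor.cores": ("executorCores", int),
    "spark.executor.instances": ("numExecutors", int),
}


def to_livy_session_request(zeppelin_request):
    """Move known resource settings from "conf" to top-level Livy fields.

    Entries in "conf" that have no dedicated Livy field stay in "conf".
    The input dictionary is not modified.
    """
    result = {k: v for k, v in zeppelin_request.items() if k != "conf"}
    remaining_conf = {}
    for key, value in zeppelin_request.get("conf", {}).items():
        if key in SPARK_CONF_TO_LIVY_FIELD:
            field, cast = SPARK_CONF_TO_LIVY_FIELD[key]
            result[field] = cast(value)
        else:
            remaining_conf[key] = value
    if remaining_conf:
        result["conf"] = remaining_conf
    return result


if __name__ == "__main__":
    # Request body as seen in the Zeppelin log (proxyUser omitted).
    request = {
        "kind": "pyspark",
        "conf": {
            "spark.executor.memory": "2",
            "spark.driver.memory": "4",
            "spark.driver.cores": "1",
            "spark.executor.cores": "1",
            "spark.executor.instances": "10",
        },
    }
    print(json.dumps(to_livy_session_request(request), indent=2))
```

Whether such a translation belongs in the Zeppelin interpreter or in the Livy 
server is exactly the open question of this issue.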

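There is clearly a mismatch between Zeppelin and Livy in how Spark resources 
are specified: Zeppelin sends them as spark.* properties inside "conf", while 
Livy documents dedicated fields such as driverMemory and numExecutors. I am 
not sure whether Zeppelin or Livy should fix this.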



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
