heyang wang created ZEPPELIN-2558:
-------------------------------------

             Summary: Livy configuration mismatch
                 Key: ZEPPELIN-2558
                 URL: https://issues.apache.org/jira/browse/ZEPPELIN-2558
             Project: Zeppelin
          Issue Type: Bug
          Components: livy-interpreter
    Affects Versions: 0.7.1
            Reporter: heyang wang

I am using Zeppelin 0.7.1 with livy-0.4-snapshot. When I edit the Livy interpreter settings related to Spark resources in the Zeppelin web UI, I get the following error from the YARN application master:

Exception in thread "main" java.lang.OutOfMemoryError: GC overhead limit exceeded
        at org.apache.xerces.dom.DeferredDocumentImpl.getNodeObject(Unknown Source)
        at org.apache.xerces.dom.DeferredDocumentImpl.synchronizeChildren(Unknown Source)
        at org.apache.xerces.dom.DeferredElementNSImpl.synchronizeChildren(Unknown Source)
        at org.apache.xerces.dom.ParentNode.hasChildNodes(Unknown Source)
        at org.apache.hadoop.conf.Configuration.loadResource(Configuration.java:2551)
        at org.apache.hadoop.conf.Configuration.loadResources(Configuration.java:2444)
        at org.apache.hadoop.conf.Configuration.getProps(Configuration.java:2361)
        at org.apache.hadoop.conf.Configuration.get(Configuration.java:968)
        at org.apache.hadoop.conf.Configuration.getTrimmed(Configuration.java:987)
        at org.apache.hadoop.conf.Configuration.getBoolean(Configuration.java:1388)
        at org.apache.hadoop.security.SecurityUtil.<clinit>(SecurityUtil.java:70)
        at org.apache.hadoop.security.UserGroupInformation.initialize(UserGroupInformation.java:272)
        at org.apache.hadoop.security.UserGroupInformation.setConfiguration(UserGroupInformation.java:311)
        at org.apache.spark.deploy.SparkHadoopUtil.<init>(SparkHadoopUtil.scala:55)
        at org.apache.spark.deploy.yarn.YarnSparkHadoopUtil.<init>(YarnSparkHadoopUtil.scala:56)
        at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
        at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
        at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
        at java.lang.reflect.Constructor.newInstance(Constructor.java:422)
        at java.lang.Class.newInstance(Class.java:442)
        at org.apache.spark.deploy.SparkHadoopUtil$.liftedTree1$1(SparkHadoopUtil.scala:414)
        at org.apache.spark.deploy.SparkHadoopUtil$.yarn$lzycompute(SparkHadoopUtil.scala:412)
        at org.apache.spark.deploy.SparkHadoopUtil$.yarn(SparkHadoopUtil.scala:412)
        at org.apache.spark.deploy.SparkHadoopUtil$.get(SparkHadoopUtil.scala:437)
        at org.apache.spark.deploy.yarn.ApplicationMaster$.main(ApplicationMaster.scala:747)
        at org.apache.spark.deploy.yarn.ApplicationMaster.main(ApplicationMaster.scala)

It turned out that the above error is caused by a mismatch between the Zeppelin Livy interpreter configuration and the Livy server configuration. In the Zeppelin logs, I can see Zeppelin posting the following JSON to the Livy server:

DEBUG [2017-05-17 11:29:39,821] ({pool-2-thread-9} HttpAccessor.java[createRequest]:79) - Created POST request for "http://10.204.11.182:8998/sessions"
DEBUG [2017-05-17 11:29:39,821] ({pool-2-thread-9} RestTemplate.java[doWithRequest]:746) - Setting request Accept header to [text/plain, application/json, application/*+json, */*]
DEBUG [2017-05-17 11:29:39,821] ({pool-2-thread-9} RestTemplate.java[doWithRequest]:841) - Writing [{
  "kind": "pyspark",
  "proxyUser": "heyang.w...@ucarinc.com",
  "conf": {
    "spark.executor.memory": "2",
    "spark.driver.memory": "4",
    "spark.driver.cores": "1",
    "spark.executor.cores": "1",
    "spark.executor.instances": "10"
  }

However, according to https://github.com/cloudera/livy, the Livy server accepts configuration fields like the following:

Name            Description                                    Type
driverMemory    Amount of memory to use for the driver process string
driverCores     Number of cores to use for the driver process  int
executorMemory  Amount of memory to use per executor process   string
executorCores   Number of cores to use for each executor       int
numExecutors    Number of executors to launch for this session int
archives        Archives to be used in this session            List of string

There is clearly a mismatch between Zeppelin and Livy in how Spark resources are specified: Zeppelin sends them as spark.* properties inside "conf", while Livy documents them as separate session fields such as driverMemory and numExecutors. I am not sure whether Zeppelin or Livy should fix this.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
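
For illustration only, the mismatch described above could be bridged by a translation along the following lines. This is a sketch under stated assumptions, not a proposed patch: the pairing of each spark.* property with a Livy session field is inferred from the names in the report, and SPARK_TO_LIVY and to_livy_session_request are hypothetical names that exist in neither Zeppelin nor Livy. The proxyUser value is a placeholder.

```python
import json

# Assumed correspondence between the Spark properties Zeppelin puts in "conf"
# and the session fields listed in the Livy README, with the type Livy documents.
SPARK_TO_LIVY = {
    "spark.driver.memory": ("driverMemory", str),
    "spark.driver.cores": ("driverCores", int),
    "spark.executor.memory": ("executorMemory", str),
    "spark.executor.cores": ("executorCores", int),
    "spark.executor.instances": ("numExecutors", int),
}


def to_livy_session_request(zeppelin_payload):
    """Return a new payload with known resource keys moved out of "conf".

    Hypothetical helper: keys found in SPARK_TO_LIVY become top-level fields,
    everything else in "conf" is passed through unchanged.
    """
    request = {k: v for k, v in zeppelin_payload.items() if k != "conf"}
    remaining_conf = {}
    for key, value in zeppelin_payload.get("conf", {}).items():
        if key in SPARK_TO_LIVY:
            field, cast = SPARK_TO_LIVY[key]
            request[field] = cast(value)
        else:
            remaining_conf[key] = value
    if remaining_conf:
        request["conf"] = remaining_conf
    return request


if __name__ == "__main__":
    # Same resource values as in the logged request; proxyUser is a placeholder.
    payload = {
        "kind": "pyspark",
        "proxyUser": "some.user",
        "conf": {
            "spark.executor.memory": "2",
            "spark.driver.memory": "4",
            "spark.driver.cores": "1",
            "spark.executor.cores": "1",
            "spark.executor.instances": "10",
        },
    }
    print(json.dumps(to_livy_session_request(payload), indent=2))
```

The values are copied through as given, so a memory setting such as "4" stays "4"; whether a unit suffix is required is outside the scope of this sketch.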