Hi Paul, I think I found the root cause; it is indeed an issue in SparkInterpreterLauncher. Could you create a ticket for it?
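The giveaway is the Cannot run program "null/bin/spark-submit" line in your trace: the launcher builds the spark-submit path from SPARK_HOME before verifying it is actually set, so a missing value becomes the literal string "null". A minimal sketch of that failure mode (illustrative only, not the actual Zeppelin source; the class and variable names here are made up):

import java.io.IOException;

public class SparkSubmitPathSketch {
  public static void main(String[] args) throws IOException {
    // getenv returns null when SPARK_HOME is not set in the environment.
    String sparkHome = System.getenv("SPARK_HOME");

    // Java string concatenation renders null as the literal "null",
    // producing exactly the path seen in the stack trace below.
    String sparkSubmit = sparkHome + "/bin/spark-submit"; // "null/bin/spark-submit"

    // Failing fast with a clear message would be better than the opaque
    // "error=2, No such file or directory" thrown by ProcessBuilder.
    if (sparkHome == null || sparkHome.isEmpty()) {
      throw new IOException("SPARK_HOME is not set; cannot locate spark-submit");
    }
    new ProcessBuilder(sparkSubmit, "--version").start();
  }
}

Note that your interpreter setting also has SPARK_HOME set to an empty string, which presumably forces the launcher to fall back to the environment.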
Paul Brenner <pbren...@placeiq.com> wrote on Tue, Mar 30, 2021 at 2:14 AM:

> We built from master on March 26th; looks like the most recent commit was
> 85ed8e2e51e1ea10df38d4710216343efe218d60.
>
> Not sure of the best way to share my full interpreter settings, so here
> they are from the JSON. Before this build this interpreter was working
> fine with deployMode: cluster. Now I have to use deployMode: client if I
> want to run. Cluster mode gives the error message immediately.
>
> "spark_paul": {
>   "id": "spark_paul",
>   "name": "spark_paul",
>   "group": "spark",
>   "properties": {
>     "SPARK_HOME": { "name": "SPARK_HOME", "value": "", "type": "string", "description": "Location of spark distribution" },
>     "spark.master": { "name": "spark.master", "value": "yarn", "type": "string", "description": "Spark master uri. local | yarn-client | yarn-cluster | spark master address of standalone mode, ex) spark://master_host:7077" },
>     "spark.submit.deployMode": { "name": "spark.submit.deployMode", "value": "client", "type": "string", "description": "The deploy mode of Spark driver program, either \"client\" or \"cluster\", Which means to launch driver program locally (\"client\") or remotely (\"cluster\") on one of the nodes inside the cluster." },
>     "spark.app.name": { "name": "spark.app.name", "value": "zeppelin_dev_paul", "type": "string", "description": "The name of spark application." },
>     "spark.driver.cores": { "name": "spark.driver.cores", "value": "1", "type": "number", "description": "Number of cores to use for the driver process, only in cluster mode." },
>     "spark.driver.memory": { "name": "spark.driver.memory", "value": "5g", "type": "string", "description": "Amount of memory to use for the driver process, i.e. where SparkContext is initialized, in the same format as JVM memory strings with a size unit suffix (\"k\", \"m\", \"g\" or \"t\") (e.g. 512m, 2g)." },
>     "spark.executor.cores": { "name": "spark.executor.cores", "value": "1", "type": "number", "description": "The number of cores to use on each executor" },
>     "spark.executor.memory": { "name": "spark.executor.memory", "value": "3g", "type": "string", "description": "Executor memory per worker instance. ex) 512m, 32g" },
>     "spark.executor.instances": { "name": "spark.executor.instances", "value": "2", "type": "number", "description": "The number of executors for static allocation." },
>     "spark.files": { "name": "spark.files", "value": "", "type": "string", "description": "Comma-separated list of files to be placed in the working directory of each executor. Globs are allowed." },
>     "spark.jars": { "name": "spark.jars", "value": "http://nexus.placeiq.net:8081/nexus/content/repositories/releases/com/placeiq/lap/4.1.25/lap-4.1.25.jar,hdfs://gandalf-nn.placeiq.net/lib/dap/0.1.0/dap-jar-assembled.jar", "type": "string", "description": "Comma-separated list of jars to include on the driver and executor classpaths. Globs are allowed." },
>     "spark.jars.packages": { "name": "spark.jars.packages", "value": "ds-commons:ds-commons_2.11:0.1-SNAPSHOT", "type": "string", "description": "Comma-separated list of Maven coordinates of jars to include on the driver and executor classpaths. The coordinates should be groupId:artifactId:version. If spark.jars.ivySettings is given artifacts will be resolved according to the configuration in the file, otherwise artifacts will be searched for in the local maven repo, then maven central and finally any additional remote repositories given by the command-line option --repositories." },
>     "zeppelin.spark.useHiveContext": { "name": "zeppelin.spark.useHiveContext", "value": true, "type": "checkbox", "description": "Use HiveContext instead of SQLContext if it is true. Enable hive for SparkSession." },
>     "zeppelin.spark.printREPLOutput": { "name": "zeppelin.spark.printREPLOutput", "value": true, "type": "checkbox", "description": "Print REPL output" },
>     "zeppelin.spark.maxResult": { "name": "zeppelin.spark.maxResult", "value": "1000", "type": "number", "description": "Max number of result to display." },
>     "zeppelin.spark.enableSupportedVersionCheck": { "name": "zeppelin.spark.enableSupportedVersionCheck", "value": true, "type": "checkbox", "description": "Whether checking supported spark version. Developer only setting, not for production use" },
>     "zeppelin.spark.uiWebUrl": { "name": "zeppelin.spark.uiWebUrl", "value": "", "type": "string", "description": "Override Spark UI default URL. In Kubernetes mode, value can be Jinja template string with 3 template variables 'PORT', 'SERVICE_NAME' and 'SERVICE_DOMAIN'. (ex: http://{{PORT}}-{{SERVICE_NAME}}.{{SERVICE_DOMAIN}})" },
>     "zeppelin.spark.ui.hidden": { "name": "zeppelin.spark.ui.hidden", "value": false, "type": "checkbox", "description": "Whether hide spark ui in zeppelin ui" },
>     "spark.webui.yarn.useProxy": { "name": "spark.webui.yarn.useProxy", "value": false, "type": "checkbox", "description": "whether use yarn proxy url as spark weburl, e.g. http://localhost:8088/proxy/application_1583396598068_0004" },
>     "zeppelin.spark.scala.color": { "name": "zeppelin.spark.scala.color", "value": true, "type": "checkbox", "description": "Whether enable color output of spark scala interpreter" },
>     "zeppelin.spark.deprecatedMsg.show": { "name": "zeppelin.spark.deprecatedMsg.show", "value": true, "type": "checkbox", "description": "Whether show the spark deprecated message, spark 2.2 and before are deprecated. Zeppelin will display warning message by default" },
>     "zeppelin.spark.concurrentSQL": { "name": "zeppelin.spark.concurrentSQL", "value": true, "type": "checkbox", "description": "Execute multiple SQL concurrently if set true." },
>     "zeppelin.spark.concurrentSQL.max": { "name": "zeppelin.spark.concurrentSQL.max", "value": "10", "type": "number", "description": "Max number of SQL concurrently executed" },
>     "zeppelin.spark.sql.stacktrace": { "name": "zeppelin.spark.sql.stacktrace", "value": true, "type": "checkbox", "description": "Show full exception stacktrace for SQL queries if set to true." },
>     "zeppelin.spark.sql.interpolation": { "name": "zeppelin.spark.sql.interpolation", "value": false, "type": "checkbox", "description": "Enable ZeppelinContext variable interpolation into spark sql" },
>     "PYSPARK_PYTHON": { "name": "PYSPARK_PYTHON", "value": "python", "type": "string", "description": "Python binary executable to use for PySpark in both driver and workers (default is python2.7 if available, otherwise python). Property `spark.pyspark.python` take precedence if it is set" },
>     "PYSPARK_DRIVER_PYTHON": { "name": "PYSPARK_DRIVER_PYTHON", "value": "python", "type": "string", "description": "Python binary executable to use for PySpark in driver only (default is `PYSPARK_PYTHON`). Property `spark.pyspark.driver.python` take precedence if it is set" },
>     "zeppelin.pyspark.useIPython": { "name": "zeppelin.pyspark.useIPython", "value": true, "type": "checkbox", "description": "Whether use IPython when it is available" },
>     "zeppelin.R.knitr": { "name": "zeppelin.R.knitr", "value": true, "type": "checkbox", "description": "Whether use knitr or not" },
>     "zeppelin.R.cmd": { "name": "zeppelin.R.cmd", "value": "R", "type": "string", "description": "R binary executable path" },
>     "zeppelin.R.image.width": { "name": "zeppelin.R.image.width", "value": "100%", "type": "number", "description": "Image width of R plotting" },
>     "zeppelin.R.render.options": { "name": "zeppelin.R.render.options", "value": "out.format = 'html', comment = NA, echo = FALSE, results = 'asis', message = F, warning = F, fig.retina = 2", "type": "textarea", "description": "" },
>     "zeppelin.kotlin.shortenTypes": { "name": "zeppelin.kotlin.shortenTypes", "value": true, "type": "checkbox", "description": "Show short types instead of full, e.g. List<String> or kotlin.collections.List<kotlin.String>" },
>     "spark.dynamicAllocation.executorIdleTimeout": { "name": "spark.dynamicAllocation.executorIdleTimeout", "value": "2m", "type": "textarea" },
>     "spark.dynamicAllocation.enabled": { "name": "spark.dynamicAllocation.enabled", "value": "true", "type": "textarea" },
>     "spark.dynamicAllocation.minExecutors": { "name": "spark.dynamicAllocation.minExecutors", "value": "4", "type": "textarea" },
>     "spark.shuffle.service.enabled": { "name": "spark.shuffle.service.enabled", "value": "true", "type": "textarea" },
>     "spark.yarn.queue": { "name": "spark.yarn.queue", "value": "pbrenner", "type": "textarea" },
>     "spark.dynamicAllocation.cachedExecutorIdleTimeout": { "name": "spark.dynamicAllocation.cachedExecutorIdleTimeout", "value": "2m", "type": "textarea" },
>     "spark.jars.repositories": { "name": "spark.jars.repositories", "value": "http://nexus.placeiq.net:8081/nexus/content/repositories/snapshots", "type": "textarea" },
>     "spark.executor.memoryOverhead": { "name": "spark.executor.memoryOverhead", "value": "4g", "type": "textarea" },
>     "zeppelin.interpreter.connect.timeout": { "name": "zeppelin.interpreter.connect.timeout", "value": "300000", "type": "textarea" }
>   },
>   "status": "READY",
>   "interpreterGroup": [
>     { "name": "spark", "class": "org.apache.zeppelin.spark.SparkInterpreter", "defaultInterpreter": true, "editor": { "language": "scala", "editOnDblClick": false, "completionKey": "TAB", "completionSupport": true } },
>     { "name": "sql", "class": "org.apache.zeppelin.spark.SparkSqlInterpreter", "defaultInterpreter": false, "editor": { "language": "sql", "editOnDblClick": false, "completionKey": "TAB", "completionSupport": true } },
>     { "name": "pyspark", "class": "org.apache.zeppelin.spark.PySparkInterpreter", "defaultInterpreter": false, "editor": { "language": "python", "editOnDblClick": false, "completionKey": "TAB", "completionSupport": true } },
>     { "name": "ipyspark", "class": "org.apache.zeppelin.spark.IPySparkInterpreter", "defaultInterpreter": false, "editor": { "language": "python", "editOnDblClick": false, "completionSupport": true, "completionKey": "TAB" } },
>     { "name": "r", "class": "org.apache.zeppelin.spark.SparkRInterpreter", "defaultInterpreter": false, "editor": { "language": "r", "editOnDblClick": false, "completionSupport": false, "completionKey": "TAB" } },
>     { "name": "ir", "class": "org.apache.zeppelin.spark.SparkIRInterpreter", "defaultInterpreter": false, "editor": { "language": "r", "editOnDblClick": false, "completionSupport": true, "completionKey": "TAB" } },
>     { "name": "shiny", "class": "org.apache.zeppelin.spark.SparkShinyInterpreter", "defaultInterpreter": false, "editor": { "language": "r", "editOnDblClick": false, "completionSupport": true, "completionKey": "TAB" } },
>     { "name": "kotlin", "class": "org.apache.zeppelin.spark.KotlinSparkInterpreter", "defaultInterpreter": false, "editor": { "language": "kotlin", "editOnDblClick": false, "completionKey": "TAB", "completionSupport": false } }
>   ],
>   "dependencies": [],
>   "option": { "remote": true, "port": -1, "perNote": "isolated", "perUser": "isolated", "isExistingProcess": false, "setPermission": false, "owners": [], "isUserImpersonate": true }
> },
>
> Paul Brenner
> Head of Data Science
> pbren...@placeiq.com | (217) 390-3033 | www.placeiq.com <https://www.placeiq.com>
> twitter @placeiq <https://twitter.com/PlaceIQ> | linkedin /placeiq <https://www.linkedin.com/company/placeiq/>
>
> On Mar 27, 2021, 8:22 AM -0400, Jeff Zhang <zjf...@gmail.com> wrote:
>
>> Hi Paul,
>>
>> Could you share your Spark interpreter setting? And which commit id did
>> you build from?
>>
>> For this kind of thing, if you are sure it is a bug, you can create a
>> ticket directly. If you are not sure, ask first on the user/dev mailing
>> list; once we confirm it is a bug, you can create a ticket for it. Any
>> kind of discussion is welcome on the mailing list or in the Slack channel.
>>
>> Paul Brenner <pbren...@placeiq.com> wrote on Sat, Mar 27, 2021 at 12:32 AM:
>>
>>> We just built from the latest source code today to help test the problem
>>> in this open ticket <https://issues.apache.org/jira/browse/ZEPPELIN-4599>.
>>> Unfortunately, cluster mode now appears broken (client mode seems to
>>> work fine).
>>> We see the following error when trying to start any interpreter in
>>> cluster mode:
>>>
>>> org.apache.zeppelin.interpreter.InterpreterException: java.io.IOException: Fail to set additional jars for spark interpreter
>>>     at org.apache.zeppelin.interpreter.remote.RemoteInterpreter.open(RemoteInterpreter.java:129)
>>>     at org.apache.zeppelin.interpreter.remote.RemoteInterpreter.getFormType(RemoteInterpreter.java:271)
>>>     at org.apache.zeppelin.notebook.Paragraph.jobRun(Paragraph.java:442)
>>>     at org.apache.zeppelin.notebook.Paragraph.jobRun(Paragraph.java:71)
>>>     at org.apache.zeppelin.scheduler.Job.run(Job.java:172)
>>>     at org.apache.zeppelin.scheduler.AbstractScheduler.runJob(AbstractScheduler.java:132)
>>>     at org.apache.zeppelin.scheduler.RemoteScheduler$JobRunner.run(RemoteScheduler.java:182)
>>>     at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>>>     at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>>>     at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
>>>     at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
>>>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>>>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>>>     at java.lang.Thread.run(Thread.java:748)
>>> Caused by: java.io.IOException: Fail to set additional jars for spark interpreter
>>>     at org.apache.zeppelin.interpreter.launcher.SparkInterpreterLauncher.buildEnvFromProperties(SparkInterpreterLauncher.java:163)
>>>     at org.apache.zeppelin.interpreter.launcher.StandardInterpreterLauncher.launchDirectly(StandardInterpreterLauncher.java:77)
>>>     at org.apache.zeppelin.interpreter.launcher.InterpreterLauncher.launch(InterpreterLauncher.java:110)
>>>     at org.apache.zeppelin.interpreter.InterpreterSetting.createInterpreterProcess(InterpreterSetting.java:847)
>>>     at org.apache.zeppelin.interpreter.ManagedInterpreterGroup.getOrCreateInterpreterProcess(ManagedInterpreterGroup.java:66)
>>>     at org.apache.zeppelin.interpreter.remote.RemoteInterpreter.getOrCreateInterpreterProcess(RemoteInterpreter.java:104)
>>>     at org.apache.zeppelin.interpreter.remote.RemoteInterpreter.internal_create(RemoteInterpreter.java:154)
>>>     at org.apache.zeppelin.interpreter.remote.RemoteInterpreter.open(RemoteInterpreter.java:126)
>>>     ... 13 more
>>> Caused by: java.io.IOException: Cannot run program "null/bin/spark-submit": error=2, No such file or directory
>>>     at java.lang.ProcessBuilder.start(ProcessBuilder.java:1048)
>>>     at org.apache.zeppelin.interpreter.launcher.SparkInterpreterLauncher.detectSparkScalaVersion(SparkInterpreterLauncher.java:233)
>>>     at org.apache.zeppelin.interpreter.launcher.SparkInterpreterLauncher.buildEnvFromProperties(SparkInterpreterLauncher.java:127)
>>>     ... 20 more
>>> Caused by: java.io.IOException: error=2, No such file or directory
>>>     at java.lang.UNIXProcess.forkAndExec(Native Method)
>>>     at java.lang.UNIXProcess.<init>(UNIXProcess.java:247)
>>>     at java.lang.ProcessImpl.start(ProcessImpl.java:134)
>>>     at java.lang.ProcessBuilder.start(ProcessBuilder.java:1029)
>>>     ... 22 more
>>>
>>> I'm actually not sure what the best way is to report and work with devs
>>> on an error like this. Should I open a ticket? Post in the Slack? Send
>>> an email to the user list like I'm doing? I'd like to learn to be a
>>> better citizen in this community.
>>>
>>> Thanks!
>>>
>>> Paul Brenner
>>> Head of Data Science
>>> pbren...@placeiq.com | (217) 390-3033 | www.placeiq.com <https://www.placeiq.com>
>>> twitter @placeiq <https://twitter.com/PlaceIQ> | linkedin /placeiq <https://www.linkedin.com/company/placeiq/>
>>
>> --
>> Best Regards
>>
>> Jeff Zhang

--
Best Regards

Jeff Zhang