Arnaud Nauwynck created ZEPPELIN-5817:
-----------------------------------------

             Summary: Failed to run spark job from Zeppelin on Windows... cannot execute: "spark-submit" is not a valid Win32 application, need to call cmd.exe
                 Key: ZEPPELIN-5817
                 URL: https://issues.apache.org/jira/browse/ZEPPELIN-5817
             Project: Zeppelin
          Issue Type: Bug
          Components: interpreter-launcher
    Affects Versions: 0.10.1, 0.10.0, 0.9.0
            Reporter: Arnaud Nauwynck



{noformat}
Caused by: java.io.IOException: Fail to detect scala version, the reason is:Cannot run program "C:/apps/hadoop/spark-3.1.1/bin/spark-submit": CreateProcess error=193, %1 is not a valid Win32 application
        at org.apache.zeppelin.interpreter.launcher.SparkInterpreterLauncher.buildEnvFromProperties(SparkInterpreterLauncher.java:127)
        at org.apache.zeppelin.interpreter.launcher.StandardInterpreterLauncher.launchDirectly(StandardInterpreterLauncher.java:77)
        at org.apache.zeppelin.interpreter.launcher.InterpreterLauncher.launch(InterpreterLauncher.java:110)
{noformat}

Indeed, looking at the source code, we can see that this can only work on Linux, where the shell script file "spark-submit" both has the execute bit set ("chmod u+x") and starts with the shebang "#!/bin/bash".
On Windows, a text file containing a shell script is not executable: CreateProcess only runs native executable images, hence the "error=193, %1 is not a valid Win32 application".

Instead, on Windows it should call "cmd.exe" with the arguments [ "/c", "spark-submit", "--version" ].
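A minimal sketch of such an OS-aware command builder (the class and method names below are hypothetical, not actual Zeppelin API; on Windows it delegates to the "spark-submit.cmd" wrapper that Spark ships in its "bin" directory, via cmd.exe /c):

```java
import java.util.ArrayList;
import java.util.List;

public class SparkSubmitCommand {

  /**
   * Build the spark-submit command line. On Windows, a bash script cannot
   * be started by CreateProcess, so the command is wrapped in "cmd.exe /c"
   * and Spark's "spark-submit.cmd" batch wrapper is used instead.
   */
  static List<String> build(String sparkHome, String osName, String... args) {
    List<String> cmd = new ArrayList<>();
    if (osName.toLowerCase().startsWith("windows")) {
      cmd.add("cmd.exe");
      cmd.add("/c");
      cmd.add(sparkHome + "\\bin\\spark-submit.cmd");
    } else {
      cmd.add(sparkHome + "/bin/spark-submit");
    }
    for (String arg : args) {
      cmd.add(arg);
    }
    return cmd;
  }
}
```

The result could then be passed to `new ProcessBuilder(cmd)`. The OS is normally detected from the "os.name" system property; passing it as a parameter here just keeps the sketch self-contained.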


Source code link:

https://github.com/apache/zeppelin/blob/master/zeppelin-zengine/src/main/java/org/apache/zeppelin/interpreter/launcher/SparkInterpreterLauncher.java#L270

{noformat}
  private String detectSparkScalaVersion(String sparkHome, Map<String, String> env) throws Exception {
...
    ProcessBuilder builder = new ProcessBuilder(sparkHome + "/bin/spark-submit", "--version");
...
    Process process = builder.start();
{noformat}


And there is no way to bypass this: it is always called from buildEnvFromProperties():

https://github.com/apache/zeppelin/blob/master/zeppelin-zengine/src/main/java/org/apache/zeppelin/interpreter/launcher/SparkInterpreterLauncher.java#L134

{noformat}
  @Override
  public Map<String, String> buildEnvFromProperties(InterpreterLaunchContext context) throws IOException {

..

    String scalaVersion = null;
    try {
      String sparkHome = getEnv("SPARK_HOME");
      LOGGER.info("SPARK_HOME: {}", sparkHome);
      scalaVersion = detectSparkScalaVersion(sparkHome, env);
      LOGGER.info("Scala version for Spark: {}", scalaVersion);
      context.getProperties().put("zeppelin.spark.scala.version", scalaVersion);
    } catch (Exception e) {
      throw new IOException("Fail to detect scala version, the reason is:"+ e.getMessage());
    }
..
{noformat}

Then calling




--
This message was sent by Atlassian Jira
(v8.20.10#820010)
