[
https://issues.apache.org/jira/browse/SPARK-10643?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15555581#comment-15555581
]
Laurent Hoss commented on SPARK-10643:
--------------------------------------
+1
This would be very useful when using Zeppelin (running in Docker) on a *Mesos*
cluster.
Unfortunately, the Zeppelin GUI does not yet support adding jars from HDFS,
nor does it support any kind of HTTP upload; jars can only come from local dirs
(not practical inside Docker) or from a (custom) Maven repo (not ideal for
quick dev iterations).
After I learned that this should work in cluster mode, I tried to submit a
Spark job in cluster mode from within Zeppelin,
but it failed because it can't find the built-in zeppelin-spark interpreter
jar (when the driver is run in the cluster).
Not sure yet if that's actually an issue (as I'd assume Spark takes care of
transferring the provided '--jars' ..)
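As a stopgap for the client-mode case in the issue quoted below, one option is
to copy the jar to the local filesystem before calling spark-submit. The
following is only a rough sketch, not something taken from the issue: it
assumes the Hadoop `hdfs` CLI is available inside the container, and the helper
names (`build_commands`, `fetch_and_submit`) and the `/tmp` staging directory
are made up for illustration. The spark-submit path, class name, and jar URL
are the ones from the issue's example.

```python
import posixpath
import subprocess
from urllib.parse import urlparse


def build_commands(remote_jar, main_class, local_dir="/tmp",
                   spark_submit="/usr/local/spark/bin/spark-submit"):
    """Return (fetch_cmd, submit_cmd) as argument lists.

    Hypothetical helper: local_dir and the default spark_submit path are
    assumptions for illustration, not part of Spark itself.
    """
    parsed = urlparse(remote_jar)
    if parsed.scheme != "hdfs":
        raise ValueError("expected an hdfs:// URL, got: " + remote_jar)

    # Stage the jar under local_dir, keeping its original file name.
    local_jar = posixpath.join(local_dir, posixpath.basename(parsed.path))

    # 'hdfs dfs -get <src> <localdst>' copies a file out of HDFS.
    fetch_cmd = ["hdfs", "dfs", "-get", remote_jar, local_jar]

    # Client-mode spark-submit then sees an ordinary local jar.
    submit_cmd = [spark_submit, "--class", main_class, local_jar]
    return fetch_cmd, submit_cmd


def fetch_and_submit(remote_jar, main_class):
    """Run both commands in order, stopping at the first failure."""
    for cmd in build_commands(remote_jar, main_class):
        subprocess.run(cmd, check=True)
```

Calling `fetch_and_submit("hdfs://hdfs/tmp/application.jar",
"com.example.spark.streaming.EventHandler")` would run the two commands in
order. This does nothing for the Zeppelin interpreter-jar problem described
above; it only sidesteps the "Skip remote jar" warning in client mode.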
> Support HDFS application download in client mode spark submit
> -------------------------------------------------------------
>
> Key: SPARK-10643
> URL: https://issues.apache.org/jira/browse/SPARK-10643
> Project: Spark
> Issue Type: New Feature
> Components: Spark Submit
> Reporter: Alan Braithwaite
> Priority: Minor
>
> When using Mesos with Docker and Marathon, it would be nice to be able to
> make spark-submit deployable on Marathon and have it download a jar from
> HDFS instead of having to package the jar into the Docker image.
> {code}
> $ docker run -it docker.example.com/spark:latest \
>     /usr/local/spark/bin/spark-submit \
>     --class com.example.spark.streaming.EventHandler \
>     hdfs://hdfs/tmp/application.jar
> Warning: Skip remote jar hdfs://hdfs/tmp/application.jar.
> java.lang.ClassNotFoundException: com.example.spark.streaming.EventHandler
> at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
> at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
> at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
> at java.lang.Class.forName0(Native Method)
> at java.lang.Class.forName(Class.java:348)
> at org.apache.spark.util.Utils$.classForName(Utils.scala:173)
> at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:639)
> at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:180)
> at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:205)
> at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:120)
> at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
> {code}
> Although I'm aware that we can run in cluster mode with Mesos, we've already
> built some nice tools around Marathon for logging and monitoring.
> Code in question:
> https://github.com/apache/spark/blob/132718ad7f387e1002b708b19e471d9cd907e105/core/src/main/scala/org/apache/spark/deploy/SparkSubmit.scala#L723-L736