spark-submit is called in bin/interpreter.sh. I haven't tried standalone cluster mode. It is expected to run the driver on a separate host, but there is no guarantee that Zeppelin supports this.
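
For reference, here is a rough sketch of how the interpreter properties quoted further down in this thread would map onto a plain spark-submit command line for a standalone master in cluster mode. It is not what bin/interpreter.sh actually runs. The helper name `build_spark_submit_args`, the main class `com.example.Main`, and the jar path are hypothetical placeholders; only `--master`, `--deploy-mode`, `--conf`, and `--class` are real spark-submit options.

```python
from typing import Dict, List


def build_spark_submit_args(props: Dict[str, str], main_class: str, app_jar: str) -> List[str]:
    """Translate interpreter-style properties into a spark-submit argument list.

    Hypothetical helper for illustration only; Zeppelin does not expose it.
    """
    args = ["spark-submit"]
    remaining = dict(props)  # work on a copy so the caller's dict is untouched

    # "master" and "spark.submit.deployMode" have dedicated command-line options.
    master = remaining.pop("master", None)
    if master:
        args += ["--master", master]
    deploy_mode = remaining.pop("spark.submit.deployMode", None)
    if deploy_mode:
        args += ["--deploy-mode", deploy_mode]

    # Everything else is passed through as --conf key=value, in sorted order.
    for key in sorted(remaining):
        args += ["--conf", f"{key}={remaining[key]}"]

    # The application class and jar come last (placeholders in this sketch).
    args += ["--class", main_class, app_jar]
    return args


if __name__ == "__main__":
    props = {
        "master": "spark://<master-name>:7077",
        "spark.submit.deployMode": "cluster",
        "spark.executor.memory": "16g",
    }
    # Only prints the command; nothing is submitted to a cluster.
    print(" ".join(build_spark_submit_args(props, "com.example.Main", "/path/to/app.jar")))
```

Running a command like this by hand against the standalone master is one way to check whether the driver lands on a worker, independently of Zeppelin.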
On Wed, Mar 14, 2018 at 8:34 AM, Ankit Jain <ankitjain....@gmail.com> wrote:

> Hi Jhang,
> What is the expected behavior with standalone cluster mode? Should we see
> separate driver processes in the cluster (one per user) or multiple
> SparkSubmit processes?
>
> I was trying to dig into the Zeppelin code and didn't see where Zeppelin
> does the spark-submit to the cluster. Can you please point me to it?
>
> Thanks
> Ankit
>
> On Mar 13, 2018, at 5:25 PM, Jeff Zhang <zjf...@gmail.com> wrote:
>
> ZEPPELIN-2898 <https://issues.apache.org/jira/browse/ZEPPELIN-2898> is
> for yarn cluster mode. Zeppelin has an integration test for yarn mode,
> so that is guaranteed to work. But there is no test for standalone, so I am
> not sure about the behavior of standalone mode.
>
> On Wed, Mar 14, 2018 at 8:06 AM, Ruslan Dautkhanov <dautkha...@gmail.com> wrote:
>
>> https://github.com/apache/zeppelin/pull/2577 mentions yarn-cluster in
>> its title, so I assume it's only yarn-cluster.
>> Never used standalone-cluster myself.
>>
>> Which distro of Hadoop do you use?
>> Cloudera desupported standalone in CDH 5.5 and will remove it in CDH 6.
>>
>> https://www.cloudera.com/documentation/enterprise/release-notes/topics/rg_deprecated.html
>>
>> --
>> Ruslan Dautkhanov
>>
>> On Tue, Mar 13, 2018 at 5:45 PM, Jhon Anderson Cardenas Diaz <
>> jhonderson2...@gmail.com> wrote:
>>
>>> Does this new feature work only for yarn-cluster, or for Spark
>>> standalone too?
>>>
>>> On Tue, Mar 13, 2018 at 18:34, Ruslan Dautkhanov <dautkha...@gmail.com> wrote:
>>>
>>>> > Zeppelin version: 0.8.0 (merged at September 2017 version)
>>>>
>>>> https://issues.apache.org/jira/browse/ZEPPELIN-2898 was merged at the end
>>>> of September, so I'm not sure if you have that.
>>>>
>>>> Check out
>>>> https://medium.com/@zjffdu/zeppelin-0-8-0-new-features-ea53e8810235
>>>> for how to set this up.
>>>>
>>>> --
>>>> Ruslan Dautkhanov
>>>>
>>>> On Tue, Mar 13, 2018 at 5:24 PM, Jhon Anderson Cardenas Diaz <
>>>> jhonderson2...@gmail.com> wrote:
>>>>
>>>>> Hi zeppelin users!
>>>>>
>>>>> I am working with Zeppelin pointing to a Spark standalone cluster. I am
>>>>> trying to figure out a way to make Zeppelin run the Spark driver outside
>>>>> of the client process that submits the application.
>>>>>
>>>>> According to the documentation (
>>>>> http://spark.apache.org/docs/2.1.1/spark-standalone.html):
>>>>>
>>>>> *For standalone clusters, Spark currently supports two deploy modes.
>>>>> In client mode, the driver is launched in the same process as the client
>>>>> that submits the application. In cluster mode, however, the driver is
>>>>> launched from one of the Worker processes inside the cluster, and the
>>>>> client process exits as soon as it fulfills its responsibility of
>>>>> submitting the application without waiting for the application to finish.*
>>>>>
>>>>> The problem is that, even when I set the properties for a Spark
>>>>> standalone cluster and set the deploy mode to cluster, the driver still
>>>>> runs on the Zeppelin machine (according to the Spark UI executors page).
>>>>> These are the properties that I am setting for the Spark interpreter:
>>>>>
>>>>> master: spark://<master-name>:7077
>>>>> spark.submit.deployMode: cluster
>>>>> spark.executor.memory: 16g
>>>>>
>>>>> Any ideas would be appreciated.
>>>>>
>>>>> Thank you
>>>>>
>>>>> Details:
>>>>> Spark version: 2.1.1
>>>>> Zeppelin version: 0.8.0 (merged at September 2017 version)