spark-submit only runs when you run the first paragraph using the spark interpreter. After that, each paragraph sends its code to the already-running Spark app to execute.
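If you want to confirm this behavior on the Zeppelin host, a quick check like the one below can help. This is only a sketch: the assumption is that the launcher shows up in the process table under the name `SparkSubmit`, which may differ across Spark versions.

```shell
#!/bin/sh
# Hedged sketch: count driver/launcher processes on the Zeppelin host.
# Assumption: the spark-submit-launched driver appears as "SparkSubmit" in ps.
count_drivers() {
  pgrep -f 'SparkSubmit' | wc -l
}

# Run once before and once after executing the first %spark paragraph;
# in client mode the count should go from 0 to 1 on this host and then
# stay at 1 as further paragraphs reuse the same app.
echo "drivers running: $(count_drivers)"
```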
>>> Also spark standalone cluster mode should work even before this new
>>> release, right?

I didn't verify that, and I'm not sure whether other people have verified it.

ankit jain <ankitjain....@gmail.com> wrote on Thu, Mar 15, 2018 at 4:32 AM:

> Also spark standalone cluster mode should work even before this new
> release, right?
>
> On Wed, Mar 14, 2018 at 8:43 AM, ankit jain <ankitjain....@gmail.com>
> wrote:
>
>> Hi Jhang,
>> Not clear on that - I thought spark-submit was done when we run a
>> paragraph, how does the .sh file come into play?
>>
>> Thanks
>> Ankit
>>
>> On Tue, Mar 13, 2018 at 5:43 PM, Jeff Zhang <zjf...@gmail.com> wrote:
>>
>>> spark-submit is called in bin/interpreter.sh. I didn't try standalone
>>> cluster mode. It is expected to run the driver on a separate host, but
>>> it is not guaranteed that Zeppelin supports this.
>>>
>>> Ankit Jain <ankitjain....@gmail.com> wrote on Wed, Mar 14, 2018 at
>>> 8:34 AM:
>>>
>>>> Hi Jhang,
>>>> What is the expected behavior with standalone cluster mode? Should we
>>>> see separate driver processes in the cluster (one per user) or
>>>> multiple SparkSubmit processes?
>>>>
>>>> I was trying to dig into the Zeppelin code & didn't see where Zeppelin
>>>> does the spark-submit to the cluster. Can you please point to it?
>>>>
>>>> Thanks
>>>> Ankit
>>>>
>>>> On Mar 13, 2018, at 5:25 PM, Jeff Zhang <zjf...@gmail.com> wrote:
>>>>
>>>> ZEPPELIN-2898 <https://issues.apache.org/jira/browse/ZEPPELIN-2898> is
>>>> for yarn-cluster mode. Zeppelin has an integration test for yarn mode,
>>>> so that is guaranteed to work. But there is no test for standalone, so
>>>> I'm not sure about the behavior of standalone mode.
>>>>
>>>> Ruslan Dautkhanov <dautkha...@gmail.com> wrote on Wed, Mar 14, 2018 at
>>>> 8:06 AM:
>>>>
>>>>> https://github.com/apache/zeppelin/pull/2577 pronounces yarn-cluster
>>>>> in its title, so I assume it's only yarn-cluster.
>>>>> Never used standalone-cluster myself.
>>>>>
>>>>> Which distro of Hadoop do you use?
>>>>> Cloudera desupported standalone in CDH 5.5 and will remove it in CDH 6.
>>>>> https://www.cloudera.com/documentation/enterprise/release-notes/topics/rg_deprecated.html
>>>>>
>>>>> --
>>>>> Ruslan Dautkhanov
>>>>>
>>>>> On Tue, Mar 13, 2018 at 5:45 PM, Jhon Anderson Cardenas Diaz <
>>>>> jhonderson2...@gmail.com> wrote:
>>>>>
>>>>>> Does this new feature work only for yarn-cluster? Or for spark
>>>>>> standalone too?
>>>>>>
>>>>>> On Tue, Mar 13, 2018 at 18:34, Ruslan Dautkhanov <
>>>>>> dautkha...@gmail.com> wrote:
>>>>>>
>>>>>>> > Zeppelin version: 0.8.0 (merged at September 2017 version)
>>>>>>>
>>>>>>> https://issues.apache.org/jira/browse/ZEPPELIN-2898 was merged at
>>>>>>> the end of September, so I'm not sure if you have that.
>>>>>>>
>>>>>>> Check out
>>>>>>> https://medium.com/@zjffdu/zeppelin-0-8-0-new-features-ea53e8810235
>>>>>>> for how to set this up.
>>>>>>>
>>>>>>> --
>>>>>>> Ruslan Dautkhanov
>>>>>>>
>>>>>>> On Tue, Mar 13, 2018 at 5:24 PM, Jhon Anderson Cardenas Diaz <
>>>>>>> jhonderson2...@gmail.com> wrote:
>>>>>>>
>>>>>>>> Hi zeppelin users!
>>>>>>>>
>>>>>>>> I am working with zeppelin pointing to a Spark standalone cluster.
>>>>>>>> I am trying to figure out a way to make zeppelin run the spark
>>>>>>>> driver outside of the client process that submits the application.
>>>>>>>>
>>>>>>>> According to the documentation
>>>>>>>> (http://spark.apache.org/docs/2.1.1/spark-standalone.html):
>>>>>>>>
>>>>>>>> *For standalone clusters, Spark currently supports two deploy
>>>>>>>> modes. In client mode, the driver is launched in the same process
>>>>>>>> as the client that submits the application. In cluster mode,
>>>>>>>> however, the driver is launched from one of the Worker processes
>>>>>>>> inside the cluster, and the client process exits as soon as it
>>>>>>>> fulfills its responsibility of submitting the application without
>>>>>>>> waiting for the application to finish.*
>>>>>>>>
>>>>>>>> The problem is that, even when I set the properties for the spark
>>>>>>>> standalone cluster and the deploy mode to cluster, the driver
>>>>>>>> still runs inside the zeppelin machine (according to the Spark
>>>>>>>> UI/executors page). These are the properties that I am setting for
>>>>>>>> the spark interpreter:
>>>>>>>>
>>>>>>>> master: spark://<master-name>:7077
>>>>>>>> spark.submit.deployMode: cluster
>>>>>>>> spark.executor.memory: 16g
>>>>>>>>
>>>>>>>> Any ideas would be appreciated.
>>>>>>>>
>>>>>>>> Thank you
>>>>>>>>
>>>>>>>> Details:
>>>>>>>> Spark version: 2.1.1
>>>>>>>> Zeppelin version: 0.8.0 (merged at September 2017 version)
>>
>> --
>> Thanks & Regards,
>> Ankit.
>
> --
> Thanks & Regards,
> Ankit.
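For reference, the interpreter settings quoted above correspond to a plain spark-submit invocation like the one below. This is only a sketch of what a standalone cluster-mode submission looks like outside Zeppelin; the application JAR path and main class are placeholders, and as noted in the thread, Zeppelin launches the driver through bin/interpreter.sh, so it is not guaranteed to behave the same way against a standalone master.

```shell
# Sketch only: equivalent standalone cluster-mode submission done by hand.
# <master-name>, the main class, and the JAR path are placeholders; in
# standalone cluster mode the JAR must be reachable from the workers.
spark-submit \
  --master spark://<master-name>:7077 \
  --deploy-mode cluster \
  --conf spark.executor.memory=16g \
  --class com.example.YourApp \
  /path/to/your-app.jar
```

With `--deploy-mode cluster` the driver should appear on one of the Worker hosts in the Spark UI rather than on the submitting machine; if it still shows up on the Zeppelin host, the deploy-mode property is likely not reaching the actual spark-submit call.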