Some use cases discussed earlier on this thread: https://www.mail-archive.com/dev@zeppelin.apache.org/msg06323.html
https://www.mail-archive.com/dev@zeppelin.apache.org/msg06332.html

On Wed, Jan 4, 2017 at 4:51 PM, Jianfeng (Jeff) Zhang <jzh...@hortonworks.com> wrote:

> I don't understand why users want to export a Zeppelin note as a Spark application.
>
> If they want to trigger the running of a Spark app, why not use Zeppelin's REST API for that? Even if users export it as a Spark application, most of the time in reality they need to submit it through a Spark job server, so why not use Zeppelin as a Spark job server?
> And if the Spark app fails, it is pretty hard to debug it, because the exporting tool has changed/restructured the source code.
>
> If this is a pretty large and complicated Spark application, I don't think Zeppelin is a proper tool for that; they'd better use an IDE for that project.
>
> BTW, after https://github.com/apache/zeppelin/pull/1799, users can define the dependency between paragraphs, and they can run one whole note which contains different interpreters.
>
> Best Regard,
> Jeff Zhang
>
> On 1/5/17, 2:25 AM, "Luciano Resende" <luckbr1...@gmail.com> wrote:
>
> >I have made some progress with a tool to handle the points discussed in this thread. It's currently a command line tool, and given a Zeppelin notebook (note.json) it generates a Spark Scala application, compiles it using the compiler embedded in the Scala SDK and then packages all these resources into a jar that works with the spark-submit command.
> >
> >I would like to start prototyping the integration into the Zeppelin UI and I was wondering if it would be ok to use the above jar as a dependency (e.g. from a maven release) and integrate it into Zeppelin...
> >
> >Thoughts ?
> >
> >On Mon, Sep 19, 2016 at 7:47 AM, Sourav Mazumder <sourav.mazumde...@gmail.com> wrote:
> >
> >> To Moon's point, this is what my vision is around this feature -
> >>
> >> 1. User should be able to package one, more than one, or all of the paragraphs in a Notebook to create a Jar file which can be used with Spark-Submit.
> >>
> >> 2. The tool should automatically remove all the interactive statements like print, show etc.
> >>
> >> 3. The tool should automatically create a Main class in addition to the jar file(s) which will internally call the respective jar. User can then change this main class if needed for parameterization through Args.
> >>
> >> Regards,
> >> Sourav
> >>
> >> On Mon, Sep 19, 2016 at 7:33 AM, Sourav Mazumder <sourav.mazumde...@gmail.com> wrote:
> >>
> >> > I am also pretty much for this.
> >> >
> >> > I have got a similar request from each and every person/group to whom I showcased Zeppelin.
> >> >
> >> > Regards,
> >> > Sourav
> >> >
> >> > On Fri, Sep 16, 2016 at 8:06 PM, moon soo Lee <m...@apache.org> wrote:
> >> >
> >> >> Hi Luciano,
> >> >>
> >> >> I've also got a lot of questions about "Productize the notebook" every time I meet users who use Zeppelin in their work.
> >> >>
> >> >> I think it's actually about two different problems that Zeppelin needs to address.
> >> >>
> >> >> *1) Provide a way for an interactive notebook to become part of a production data pipeline.*
> >> >>
> >> >> Although Zeppelin does have a quite convenient cron-like scheduler for each Note, the built-in cron scheduler is not ready for serious use in production, because it lacks some features like actions after success/fail, fault-tolerance, history, and so on. I think the community is working on improving it, and it's going to take some time.
> >> >> Meanwhile, any external enterprise level job scheduler can run a Note or Paragraph via the REST API. But we don't have any guide and examples for it: what are the REST APIs users can use for this purpose, and how to use them in various cases (e.g. with authentication on, dynamic form parameters, etc). I think a lot of things need to be improved to make it easier for Zeppelin to be part of a production pipeline.
> >> >>
> >> >> *2) Provide a stable way to run Spark paragraphs.*
> >> >>
> >> >> Another barrier to using a notebook in a production pipeline is the Scala REPL in SparkInterpreter. SparkInterpreter uses the Scala REPL to provide an interactive Scala session, and the Scala REPL will eventually hit OOME as it compiles and runs statements. The current workaround in Zeppelin is that the cron scheduler inside the notebook has a checkbox that can restart the Note after the scheduler runs it.
> >> >> Of course that option does not apply when an external scheduler runs the job through the REST API.
> >> >>
> >> >> I think what Luciano is suggesting, "Export Spark Paragraph as Spark application", is interesting. If Spark paragraphs can be easily packaged into a jar (Spark application), that can be one way to address 1) and 2), in case the user already has a stable way to schedule a Spark application jar.
> >> >>
> >> >> Actually, the Flink interactive shell works in a similar way internally as far as I know, i.e. it packages the compiled classes into a jar and submits it.
> >> >>
> >> >> One idea for prototyping is:
> >> >> How about making an interpreter inside the Spark interpreter group, say %spark.build or some better name.
> >> >>
> >> >> And if the user runs some command like
> >> >>
> >> >> %spark.build
> >> >> package
> >> >>
> >> >> then it builds a Spark application jar based on the Spark paragraphs in the Note.
> >> >> I think it can be the simplest user interface for the prototype.
> >> >>
> >> >> Thanks,
> >> >> moon
> >> >>
> >> >> On Fri, Sep 16, 2016 at 1:11 PM Jeremy Anderson <jer...@objectadjective.com> wrote:
> >> >>
> >> >> > Luciano, I think this would be a terrific feature.
> >> >> > I've heard the exact same workflow you've described in all of the research we've done.
> >> >> >
> >> >> > ...........................
> >> >> >
> >> >> > Jeremy Anderson
> >> >> > Founder, Object Adjective
> >> >> > 415.493.8489
> >> >> > jer...@objectadjective.com
> >> >> > objectadjective.com <http://about.me/jeremyanderson>
> >> >> >
> >> >> > This email and any files transmitted with it are confidential and intended solely for the use of the individual or entity to whom they are addressed.
> >> >> >
> >> >> > On 16 September 2016 at 12:19, Luciano Resende <luckbr1...@gmail.com> wrote:
> >> >> >
> >> >> > > While talking with a few different users, I have been seeing the use case of using iterative development in Notebooks or Spark Shell and then copying and pasting the final solution to a formal application repeating itself very often.
> >> >> > >
> >> >> > > I was wondering if an "Export Spark Paragraphs as a Spark Application (jar)" would be a feature that the Zeppelin community would find useful. But keep in mind there are some limitations here: we would be constrained to Spark related paragraphs, etc... but even so, I think there are multiple scenarios where the ability to have an application that directly runs on Spark would be very useful.
> >> >> > >
> >> >> > > If the community is interested, let's use this thread to discuss any specific requirements or suggestions that others might have, and after a few days I would like to start prototyping this functionality.
> >> >> > >
> >> >> > > Thoughts ?
> >> >> > >
> >> >> > > --
> >> >> > > Luciano Resende
> >> >> > > http://twitter.com/lresende1975
> >> >> > > http://lresende.blogspot.com/
> >> >> > >
> >
> >--
> >Luciano Resende
> >http://twitter.com/lresende1975
> >http://lresende.blogspot.com/

--
Luciano Resende
http://twitter.com/lresende1975
http://lresende.blogspot.com/
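
For illustration only, here is a rough sketch of the export step discussed in this thread: collect the Spark paragraphs of a note, drop interactive statements such as print and show, and wrap the rest in a main method. It is not the command line tool mentioned above. The names `spark_paragraphs`, `export_note` and `INTERACTIVE` are hypothetical, the sketch assumes note.json holds a "paragraphs" list whose entries carry their source in a "text" field beginning with a `%spark` line, and the regex filter is a crude stand-in for real Scala parsing. Compiling and packaging the generated source into a jar is left out.

```python
import json
import re

# Hypothetical rule: a line is "interactive" if it is a bare println(...)
# call or ends in a .show(...) call. A real tool would parse the Scala code.
INTERACTIVE = re.compile(r"^\s*(println\s*\(.*\)|.*\.show\s*\(.*\))\s*$")


def spark_paragraphs(note):
    """Yield the Scala body of each paragraph whose first line is %spark."""
    for paragraph in note.get("paragraphs", []):
        text = (paragraph.get("text") or "").strip()
        first, _, rest = text.partition("\n")
        if first.strip() == "%spark":
            yield rest


def export_note(note, object_name="ExportedNote"):
    """Return Scala source that wraps the note's Spark code in a main method."""
    body = []
    for code in spark_paragraphs(note):
        for line in code.splitlines():
            # Keep non-empty lines that do not look interactive.
            if line.strip() and not INTERACTIVE.match(line):
                body.append("    " + line)
    header = [
        f"object {object_name} {{",
        "  def main(args: Array[String]): Unit = {",
        # The notebook provides spark and sc implicitly; an application must
        # create them itself.
        "    val spark = org.apache.spark.sql.SparkSession.builder().getOrCreate()",
        "    val sc = spark.sparkContext",
    ]
    return "\n".join(header + body + ["  }", "}"])


if __name__ == "__main__":
    # Assumed note.json shape, reduced to the fields this sketch reads.
    raw = '{"paragraphs": [{"text": "%spark\\nval df = spark.range(10)\\ndf.show()"}]}'
    print(export_note(json.loads(raw)))
```

Because the generated main method receives `args`, a user could edit it afterwards for parameterization, as suggested in point 3 of the list above. Line-based filtering like this would also make failures harder to trace back to the original paragraphs, which is the debugging concern raised earlier in the thread.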