Sorry to jump in… If you want to run paragraphs in parallel, you are going to want some sort of dependency graph. Think of a common setup paragraph where you define shared functions and imports (for example, a %spark.dep setup) that has to finish before the paragraphs that use it.
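As a rough illustration of the dependency-graph idea, here is a minimal Python sketch. It assumes each paragraph declares the ids it depends on; the `DEPENDS_ON` mapping, the paragraph ids, and `parallel_batches` are hypothetical names for illustration, not part of Zeppelin's note format or API. It uses the standard-library `graphlib` module (Python 3.9+) to group paragraphs into batches, where everything in a batch could run in parallel once the earlier batches are done.

```python
from graphlib import TopologicalSorter

# Hypothetical paragraph metadata: paragraph id -> ids it depends on.
# These names are illustrative only, not Zeppelin's actual note format.
DEPENDS_ON = {
    "setup": [],
    "load_data": ["setup"],
    "test_a": ["load_data"],
    "test_b": ["load_data"],
    "teardown": ["test_a", "test_b"],
}


def parallel_batches(depends_on):
    """Group paragraph ids into batches that depend only on earlier batches."""
    sorter = TopologicalSorter(depends_on)
    sorter.prepare()  # raises graphlib.CycleError on circular dependencies
    batches = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # everything runnable right now
        batches.append(ready)
        sorter.done(*ready)  # mark the batch finished to unlock its dependents
    return batches


batches = parallel_batches(DEPENDS_ON)
for number, batch in enumerate(batches, start=1):
    print(number, batch)
```

With this sample graph the two test paragraphs land in the same batch, while setup and teardown each get a batch of their own. A purely sequential run is the special case where every paragraph depends on the one before it.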
A good example is a notebook that is a bunch of unit tests, where you need to build the common setup / teardown methods used by the other paragraphs. If you’re going to do that, you’ll need to build out a metadata structure where you can declare your dependencies, as well as add things like labels beyond the ids (which only need to be unique within the given notebook).

Just my $0.02

On Sep 29, 2017, at 1:30 PM, moon soo Lee <m...@apache.org> wrote:

Current behavior is as parallel as possible. The run notebook button currently submits all paragraphs in a notebook to each interpreter's own scheduler (FIFO, Parallel) at once, and each interpreter's scheduler then runs its paragraphs.

I think we can provide a "sequential" run button for easier use, which submits one paragraph and waits for it to finish before submitting the next one.

And I think a sequential run button doesn't stop us from having a more complex / flexible DAG in the future?

Thanks,
moon

On Fri, Sep 29, 2017 at 10:08 AM Mohit Jaggi <mohitja...@gmail.com> wrote:

What is the current behavior?

On Fri, Sep 29, 2017 at 6:56 AM, Herval Freire <hfre...@twitter.com> wrote:

At least in our case, the notebooks that we need to run sequentially are expected to *always* run sequentially - thus it makes more sense for this to be a note option than a per-run mode

H

_____________________________
From: moon soo Lee <m...@apache.org>
Sent: Thursday, September 28, 2017 9:03 PM
Subject: Re: Implementing run all paragraphs sequentially
To: <users@zeppelin.apache.org>

This is going to be really useful!

Curious why you prefer 'note option' instead of 'run option'? Could you compare their pros and cons?
Thanks,
moon

On Thu, Sep 28, 2017 at 8:32 AM Herval Freire <hfre...@twitter.com> wrote:

+1, our internal users at Twitter also often request this

________________________________
From: Belousov Maksim Eduardovich <m.belou...@tinkoff.ru>
Sent: Thursday, September 28, 2017 8:28:58 AM
To: users@zeppelin.apache.org
Subject: Implementing run all paragraphs sequentially

Hello, users!

At the moment our analysts often mix interpreters in their notes. For example, they prepare data using %jdbc and then use it in %pyspark. Besides, they often use scheduling to produce regular reports, and they have to do something like `time.sleep()` to wait for the data from %jdbc. It doesn't guarantee the result and doesn't look cool.

You can find early attempts to implement sequential running of all paragraphs in [1]. We are really interested in the implementation of issue [2] and are ready to work on it.

It seems a good idea to discuss any requirements. My idea is to introduce a note setting that defines the type of run to use (parallel or sequential) and leave "Run all" as the only button that runs all the cells in the note. This will make sequential or parallel running a `note option` rather than a `run option`. The option will be controlled by a nearby button, as shown [image002.jpg]

For new notes the default state would be "Run sequential all", for old ones - "Run parallel for interpreters".

We are glad to hear any thoughts. Thank you.

[1] https://issues.apache.org/jira/browse/ZEPPELIN-1165
[2] https://issues.apache.org/jira/browse/ZEPPELIN-2368

Maksim Belousov
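
To make the difference between the two run modes in this proposal concrete, here is a minimal Python sketch. It is only an illustration under stated assumptions: `run_note`, `prepare_data`, and `use_data` are hypothetical stand-ins (plain Python callables playing the role of a %jdbc and a %pyspark paragraph), and a thread pool stands in for the interpreter schedulers. None of this is Zeppelin's actual implementation.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def run_note(paragraphs, sequential):
    """Run (name, callable) paragraphs and return names in completion order."""
    finished = []

    def run(name, body):
        body()
        finished.append(name)

    with ThreadPoolExecutor(max_workers=4) as pool:
        if sequential:
            # Submit one paragraph and wait for it before submitting the next.
            for name, body in paragraphs:
                pool.submit(run, name, body).result()
        else:
            # Submit everything at once, then wait for all of it to finish.
            futures = [pool.submit(run, name, body) for name, body in paragraphs]
            for future in futures:
                future.result()
    return finished


table = {}  # shared state written by one paragraph and read by the next
seen = []   # what the second paragraph observed


def prepare_data():  # hypothetical stand-in for a %jdbc paragraph
    time.sleep(0.05)
    table["rows"] = [1, 2, 3]


def use_data():  # hypothetical stand-in for a %pyspark paragraph
    seen.append(table.get("rows"))


order = run_note([("jdbc", prepare_data), ("pyspark", use_data)], sequential=True)
print(order, seen)
```

In sequential mode the second paragraph always sees the data without any `time.sleep()` on its side. With `sequential=False` both paragraphs start together, so whether the reader sees the rows depends on timing, which is the problem described above.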