Thanks Vadim & Jörn... I will look into those. jg
> On Jun 20, 2017, at 2:12 PM, Vadim Semenov <vadim.seme...@datadoghq.com> wrote:
>
> You can launch one permanent Spark context and then execute your jobs within that context. Since they all run in the same context, they can share data easily.
>
> These two projects provide the functionality you need:
> https://github.com/spark-jobserver/spark-jobserver#persistent-context-mode---faster--required-for-related-jobs
> https://github.com/cloudera/livy#post-sessions
>
> On Tue, Jun 20, 2017 at 1:46 PM, Jean Georges Perrin <j...@jgp.net> wrote:
> Hey,
>
> Here is my need: program A does something on a set of data and produces results, program B does the same on another set, and finally program C combines the data from A and B. Of course, the easy way is to dump everything to disk once A and B are done, but I wanted to avoid that.
>
> I was thinking of creating a temp view, but I do not really like the "temp" aspect of it ;). Any ideas? (They are all worth sharing.)
>
> jg
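
For anyone following along, here is a minimal sketch of the pattern Vadim describes, applied to the A/B/C need above: one long-lived SparkSession, A's and B's results kept in the context (cached and, optionally, registered as temp views), and C joining them without anything hitting disk. The loaders loadSetA/loadSetB are hypothetical stand-ins for the real inputs. Note that a temp view lives exactly as long as the session, so in a permanent context "temp" effectively means "for the life of the pipeline".

import org.apache.spark.sql.{DataFrame, SparkSession}

object SharedContextPipeline {
  def main(args: Array[String]): Unit = {
    // One long-lived session; everything below shares its context.
    val spark = SparkSession.builder()
      .appName("shared-context-pipeline")
      .master("local[*]") // for a quick local test; drop under spark-submit
      .getOrCreate()

    // "Program" A: produce a result and keep it in the context.
    val a: DataFrame = loadSetA(spark).groupBy("key").count()
    a.cache()                       // keep A's result in memory
    a.createOrReplaceTempView("a")  // or expose it by name to other jobs

    // "Program" B: same idea on the other data set.
    val b: DataFrame = loadSetB(spark).groupBy("key").count()
    b.cache()

    // "Program" C: combine A's and B's results without touching disk.
    val combined = a.join(b, Seq("key"))
    combined.show()

    spark.stop()
  }

  // Hypothetical loaders; replace with the real data sources.
  private def loadSetA(spark: SparkSession): DataFrame =
    spark.read.json("setA.json")
  private def loadSetB(spark: SparkSession): DataFrame =
    spark.read.json("setB.json")
}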
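The Livy link boils down to two REST calls: POST /sessions creates a long-lived interactive session, and POST /sessions/{id}/statements runs code inside it, with later statements seeing the variables earlier ones defined. A rough sketch, assuming a Livy server on localhost:8998 (it uses the Java 11 HTTP client, and hardcodes session id 0 for brevity where real code would parse the id from the create response):

import java.net.URI
import java.net.http.{HttpClient, HttpRequest, HttpResponse}

object LivySessions {
  def main(args: Array[String]): Unit = {
    val livy = "http://localhost:8998" // assumed Livy endpoint
    val client = HttpClient.newHttpClient()

    // POST /sessions: create one long-lived interactive session.
    val create = HttpRequest.newBuilder(URI.create(s"$livy/sessions"))
      .header("Content-Type", "application/json")
      .POST(HttpRequest.BodyPublishers.ofString("""{"kind": "spark"}"""))
      .build()
    println(client.send(create, HttpResponse.BodyHandlers.ofString()).body())

    // POST /sessions/{id}/statements: run code inside that session.
    // Session id 0 is hardcoded for brevity; real code should parse it
    // out of the JSON returned by the create call above.
    val stmt = HttpRequest.newBuilder(URI.create(s"$livy/sessions/0/statements"))
      .header("Content-Type", "application/json")
      .POST(HttpRequest.BodyPublishers.ofString(
        """{"code": "val n = sc.parallelize(1 to 100).sum; println(n)"}"""))
      .build()
    println(client.send(stmt, HttpResponse.BodyHandlers.ofString()).body())
  }
}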