OK, then we need another trick: let's have an *implicit lazy val connection/context* around our code, and setup() will trigger the evaluation and initialization.
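Something like this, as a rough sketch (the object name, the JDBC URL and the lookup function are just hypothetical placeholders):

    import java.sql.{Connection, DriverManager}

    // A lazy val in a singleton object is initialized at most once per JVM,
    // i.e. once per executor, the first time a task touches it.
    object Connections {
      lazy val db: Connection =
        DriverManager.getConnection("jdbc:h2:mem:example") // placeholder URL
    }

    // No explicit setup() call is needed; the first access on each executor
    // triggers the initialization, and later tasks reuse the same connection:
    // rdd.map(row => lookup(Connections.db, row))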
The implicit lazy val trick was actually invented by Kevin. :)

Jianshi

On Fri, Nov 14, 2014 at 1:41 PM, Cheng Lian <lian.cs....@gmail.com> wrote:

> If you're just relying on the side effects of setup() and cleanup(), then
> I think this trick is OK and pretty clean.
>
> But if setup() returns, say, a DB connection, then the map(...) part and
> cleanup() can't get the connection object.
>
> On 11/14/14 1:20 PM, Jianshi Huang wrote:
>
> So can I write it like this?
>
> rdd.mapPartitions { i => setup(); i }.map(...).mapPartitions { i => cleanup(); i }
>
> That way I don't mess up the logic and can still use map, filter, and
> other transformations on the RDD.
>
> Jianshi
>
> On Fri, Nov 14, 2014 at 12:20 PM, Cheng Lian <lian.cs....@gmail.com> wrote:
>
>> If you're looking for executor-side setup and cleanup functions, there
>> aren't any yet, but you can achieve the same semantics via
>> RDD.mapPartitions.
>>
>> Please check the "setup() and cleanup()" section of this blog post from
>> Cloudera for details:
>> http://blog.cloudera.com/blog/2014/09/how-to-translate-from-mapreduce-to-apache-spark/
>>
>> On 11/14/14 10:44 AM, Dai, Kevin wrote:
>>
>> Hi all,
>>
>> Is there a setup and cleanup function in Spark, as in Hadoop MapReduce,
>> that does some initialization and cleanup work?
>>
>> Best Regards,
>> Kevin

--
Jianshi Huang

LinkedIn: jianshi
Twitter: @jshuang
Github & Blog: http://huangjs.github.com/
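For completeness, the setup()/cleanup() pattern from the Cloudera post above looks roughly like this (a sketch only; Conn, createConnection and process are hypothetical stand-ins for the real resource and per-element work):

    import org.apache.spark.rdd.RDD

    object SetupCleanupExample {
      // Hypothetical resource type standing in for a real DB connection.
      class Conn { def close(): Unit = () }
      def createConnection(): Conn = new Conn
      def process(conn: Conn, s: String): Int = s.length // placeholder work

      // The connection is opened once per partition and closed right after
      // the last element is produced, all inside a single mapPartitions call.
      def mapWithSetupCleanup(rdd: RDD[String]): RDD[Int] =
        rdd.mapPartitions { iter =>
          if (iter.isEmpty) Iterator.empty
          else {
            val conn = createConnection()       // setup: once per partition
            iter.map { elem =>
              val result = process(conn, elem)  // per-element work
              if (!iter.hasNext) conn.close()   // cleanup: after the last element
              result
            }
          }
        }
    }

Worth noting: because mapPartitions hands back a lazy iterator, the two-call variant discussed above runs cleanup() when the final partition iterator is constructed, before the preceding map(...) has consumed any elements, which is another reason to keep setup and cleanup inside one mapPartitions.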