Ok, then we need another trick.

Let's have an *implicit lazy val/var connection/context* around our code, and
setup() will trigger its evaluation and initialization.
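A minimal sketch of that trick in plain Scala (Connection and the URL here are made-up stand-ins, not a real API): a lazy val is initialized at most once per JVM, i.e. once per executor, so setup() only has to force it.

```scala
// Hypothetical stand-in for a real DB connection.
final case class Connection(url: String)

object ConnectionHolder {
  // Evaluated on first access, at most once per JVM (so once per executor).
  lazy val connection: Connection = Connection("jdbc:example://localhost")

  // setup() just forces the lazy initialization as a side effect.
  def setup(): Unit = { connection; () }
}
```

Any later code running on the same executor can then reach ConnectionHolder.connection directly.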

The implicit lazy val/var trick was actually invented by Kevin. :)

Jianshi

On Fri, Nov 14, 2014 at 1:41 PM, Cheng Lian <lian.cs....@gmail.com> wrote:

>  If you’re just relying on the side effects of setup() and cleanup(), then
> I think this trick is OK and fairly clean.
>
> But if setup() returns, say, a DB connection, then the map(...) part and
> cleanup() can’t get the connection object.
>
> On 11/14/14 1:20 PM, Jianshi Huang wrote:
>
>   So can I write it like this?
>
>  rdd.mapPartitions(i => { setup(); i }).map(...).mapPartitions(i =>
> { cleanup(); i })
>
>  So I don't need to mess up the logic, and I can still use map, filter and
> other RDD transformations.
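For illustration, the chained form can be simulated on a plain Iterator standing in for one partition (no SparkContext needed; setup/cleanup here just record events, and the whole thing is a sketch, not Spark's API). One caveat worth noting: the cleanup closure runs when the partition function is invoked, before the iterator is drained, which is fine as long as only the side effect matters.

```scala
import scala.collection.mutable

val events = mutable.Buffer.empty[String]
def setup(): Unit   = events += "setup"
def cleanup(): Unit = events += "cleanup"

// Mirrors rdd.mapPartitions(i => { setup(); i })
//            .map(...)
//            .mapPartitions(i => { cleanup(); i })
def run(partition: Iterator[Int]): Vector[Int] = {
  val withSetup = { setup(); partition }   // first mapPartitions: side effect, pass data through
  val mapped    = withSetup.map(_ * 2)     // the ordinary map(...) in between
  val finished  = { cleanup(); mapped }    // second mapPartitions: note it fires eagerly,
                                           // before the iterator is consumed
  finished.toVector
}
```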
>
>  Jianshi
>
> On Fri, Nov 14, 2014 at 12:20 PM, Cheng Lian <lian.cs....@gmail.com>
> wrote:
>
>>  If you’re looking for executor-side setup and cleanup functions, there
>> aren’t any yet, but you can achieve the same semantics via
>> RDD.mapPartitions.
>>
>> Please check the “setup() and cleanup()” section of this blog post from
>> Cloudera for details:
>> http://blog.cloudera.com/blog/2014/09/how-to-translate-from-mapreduce-to-apache-spark/
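A sketch of that mapPartitions pattern, with a hypothetical Connection standing in for real resources: setup and cleanup live in the same closure, so the per-record logic can see the connection object.

```scala
// Hypothetical stand-in for a real resource.
final case class Connection(url: String) { def close(): Unit = () }

// Shaped like an argument to rdd.mapPartitions: set up once per
// partition, use the connection per record, clean up after draining.
def withConnection(partition: Iterator[Int]): Iterator[String] = {
  val conn = Connection("jdbc:example://db")                 // setup()
  val out  = partition.map(n => s"${conn.url}#$n").toVector  // drain before closing
  conn.close()                                               // cleanup()
  out.iterator
}
```

In real Spark code, `withConnection` would be passed to `rdd.mapPartitions`, running once per partition on the executors.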
>>
>> On 11/14/14 10:44 AM, Dai, Kevin wrote:
>>
>>  Hi, all
>>
>>
>>
>> Is there a setup and cleanup function in Spark, as in Hadoop MapReduce,
>> for doing some initialization and cleanup work?
>>
>>
>>
>> Best Regards,
>>
>> Kevin.
>>
>>
>
>
>
>  --
> Jianshi Huang
>
> LinkedIn: jianshi
> Twitter: @jshuang
> Github & Blog: http://huangjs.github.com/
>
>



-- 
Jianshi Huang

LinkedIn: jianshi
Twitter: @jshuang
Github & Blog: http://huangjs.github.com/
