Re: Create/shutdown objects before/after RDD use (or: Non-serializable classes)

2014-06-01 Thread Tobias Pfeiffer
Xiangrui, thanks for your suggestion!

On Sat, May 31, 2014 at 6:12 PM, Xiangrui Meng wrote:
> One hack you can try is:
>
> rdd.mapPartitions(iter => {
>   val x = new X()
>   iter.map(row => x.doSomethingWith(row)) ++ { x.shutdown(); Iterator.empty }
> })

In fact, I employed a similar hack by n

Re: Create/shutdown objects before/after RDD use (or: Non-serializable classes)

2014-05-31 Thread Xiangrui Meng
Hi Tobias,

One hack you can try is:

rdd.mapPartitions(iter => {
  val x = new X()
  iter.map(row => x.doSomethingWith(row)) ++ { x.shutdown(); Iterator.empty }
})

Best,
Xiangrui

On Thu, May 29, 2014 at 11:38 PM, Tobias Pfeiffer wrote:
> Hi,
>
> I want to use an object x in my RDD processing as
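
The hack above seems to rely on the right-hand operand of `++` on a Scala `Iterator` being a by-name parameter, so the block containing `x.shutdown()` is not evaluated until the mapped rows have been consumed. The following is a minimal sketch of that behavior using plain Scala iterators instead of Spark. The class `X` here is a hypothetical stand-in that only records calls in a log, and `PartitionHack`, `process`, and `run` are names made up for illustration.

```scala
import scala.collection.mutable

// Hypothetical stand-in for the thread's class X; it records every call in a log.
class X(log: mutable.Buffer[String]) {
  def doSomethingWith(row: Int): Int = { log += s"row $row"; row * 2 }
  def shutdown(): Unit = { log += "shutdown"; () }
}

object PartitionHack {
  // Same shape as the function passed to rdd.mapPartitions in the thread:
  // create x once, map lazily over the rows, then shut x down at the end.
  def process(iter: Iterator[Int], log: mutable.Buffer[String]): Iterator[Int] = {
    val x = new X(log)
    // The block after ++ is passed by name, so it runs only once the
    // mapped iterator has been exhausted.
    iter.map(row => x.doSomethingWith(row)) ++ { x.shutdown(); Iterator.empty }
  }

  // Returns the log before consumption, the results, and the log afterwards.
  def run(): (List[String], List[Int], List[String]) = {
    val log = mutable.ArrayBuffer.empty[String]
    val out = process(Iterator(1, 2, 3), log)
    val before = log.toList   // iterator not consumed yet
    val result = out.toList   // consuming the iterator triggers map and shutdown
    (before, result, log.toList)
  }
}
```

One caveat with this pattern: if the consumer stops early (for example with `take`) or a row throws an exception, the iterator is never exhausted and `shutdown()` is not called.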

Create/shutdown objects before/after RDD use (or: Non-serializable classes)

2014-05-29 Thread Tobias Pfeiffer
Hi,

I want to use an object x in my RDD processing as follows:

val x = new X()
rdd.map(row => x.doSomethingWith(row))
println(rdd.count())
x.shutdown()

Now the problem is that X is non-serializable, so while this works locally, it does not work in cluster setup. I thought I could do rdd.mapPart
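
The serialization problem described above can be reproduced without a cluster. Spark ships the closure passed to `map` to the executors, by default using Java serialization, and a closure that captures a non-serializable `x` cannot be written out. The sketch below checks this with a plain `ObjectOutputStream`, assuming Scala 2.12 or later, where function literals are serializable as long as everything they capture is. `X` is again a hypothetical stand-in, and `ClosureCheck` and its methods are illustrative names, not part of Spark.

```scala
import java.io.{ByteArrayOutputStream, NotSerializableException, ObjectOutputStream}

// Hypothetical stand-in for the thread's class X; deliberately not Serializable.
class X {
  def doSomethingWith(row: Int): Int = row * 2
  def shutdown(): Unit = ()
}

object ClosureCheck {
  // Attempts Java serialization of obj and reports whether it succeeded.
  def isSerializable(obj: AnyRef): Boolean =
    try {
      val out = new ObjectOutputStream(new ByteArrayOutputStream())
      out.writeObject(obj)
      out.close()
      true
    } catch {
      case _: NotSerializableException => false
    }

  // Like rdd.map(row => x.doSomethingWith(row)): the closure captures x,
  // which was created outside of it.
  def capturing(): Int => Int = {
    val x = new X()
    row => x.doSomethingWith(row)
  }

  // Like the mapPartitions variant: x is created inside the function,
  // so the closure itself captures nothing non-serializable.
  def perPartition(): Iterator[Int] => Iterator[Int] =
    iter => {
      val x = new X()
      iter.map(row => x.doSomethingWith(row))
    }
}
```

With this sketch, `ClosureCheck.isSerializable(ClosureCheck.capturing())` should be false, while the per-partition variant should serialize, because `X` is only instantiated once the function runs.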