Re: Spark serialization in closure

2015-07-09 Thread Chen Song
Thanks Andrew. I tried your suggestions, and (2) works for me. (1) still doesn't work. Chen On Thu, Jul 9, 2015 at 4:58 PM, Andrew Or wrote: > Hi Chen, > > I believe the issue is that `object foo` is a member of `object testing`, > so the only way to access `object foo` is to first pull `o

Re: Spark serialization in closure

2015-07-09 Thread Andrew Or
Hi Chen, I believe the issue is that `object foo` is a member of `object testing`, so the only way to access `object foo` is to first pull `object testing` into the closure, then access a pointer to get to `object foo`. There are two workarounds that I'm aware of: (1) Move `object foo` outside of
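
The capture behavior described above can be sketched without Spark. This is a minimal illustration under stated assumptions, not the code from the thread: it uses a plain class (the hypothetical `Holder`) instead of the nested `object`s, and plain Java serialization (the hypothetical helper `serializes`) instead of Spark's closure handling. It shows that a function reading a member through its enclosing instance drags that instance into the closure, while a function reading a local copy does not.

```scala
import java.io.{ByteArrayOutputStream, NotSerializableException, ObjectOutputStream}

object ClosureCaptureSketch {
  // Hypothetical helper: true if plain Java serialization of `obj` succeeds.
  def serializes(obj: AnyRef): Boolean =
    try {
      val out = new ObjectOutputStream(new ByteArrayOutputStream())
      out.writeObject(obj)
      out.close()
      true
    } catch {
      case _: NotSerializableException => false
    }

  // Hypothetical enclosing class, deliberately NOT Serializable.
  class Holder {
    val v = 42

    // Reads `this.v`, so the function object holds a reference to `this`.
    def capturesThis: () => Int = () => v

    // Copies the value into a local first, so only the Int is captured.
    def capturesLocal: () => Int = {
      val local = v
      () => local
    }
  }

  def main(args: Array[String]): Unit = {
    val h = new Holder
    println(s"capturesThis serializable:  ${serializes(h.capturesThis)}")
    println(s"capturesLocal serializable: ${serializes(h.capturesLocal)}")
  }
}
```

In this sketch the first function should fail to serialize because `Holder` is not `Serializable`, and the second should succeed. Whether the same holds for nested `object`s in a Spark shell session depends on how the enclosing definitions are compiled, which this example does not attempt to reproduce.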

Re: Spark serialization in closure

2015-07-09 Thread Richard Marscher
Reading that article and applying it to your observations of what happens at runtime: shouldn't the closure require serializing testing? The foo singleton object is a member of testing, and then you call this foo value in the closure func and further in the foreachPartition closure. So following b

Re: Spark serialization in closure

2015-07-09 Thread Chen Song
Reposting the code example:

    object testing extends Serializable {
      object foo {
        val v = 42
      }
      val list = List(1,2,3)
      val rdd = sc.parallelize(list)
      def func = {
        val after = rdd.foreachPartition {
          it => println(foo.v)
        }
      }
    }

On Thu, Jul 9, 2015 at 4:09

Re: Spark serialization in closure

2015-07-09 Thread Chen Song
Thanks Erik. I saw the document too. That is why I am confused: as per the article, it should be fine as long as *foo* is serializable. However, what I have seen is that it works if *testing* is serializable, even if foo is not serializable, as shown below. I don't know if there is somethi

Re: Spark serialization in closure

2015-07-09 Thread Erik Erlandson
I think you have stumbled across this idiosyncrasy: http://erikerlandson.github.io/blog/2015/03/31/hygienic-closures-for-scala-function-serialization/ - Original Message - > I am not sure this is more of a question for Spark or just Scala but I am > posting my question here. > > The c

Spark serialization in closure

2015-07-09 Thread Chen Song
I am not sure whether this is more of a question for Spark or just Scala, but I am posting my question here. The code snippet below shows an example of passing a reference to a closure in the rdd.foreachPartition method. ``` object testing { object foo extends Serializable { val v = 42 } val