Hi folks, hoping someone can explain to me what's going on:
I have the following code, largely based on RecoverableNetworkWordCount
example (
https://github.com/apache/spark/blob/master/examples/src/main/scala/org/apache/spark/examples/streaming/RecoverableNetworkWordCount.scala
):
I am setting fields on an object that gets accessed within the map
function. But the workers do not see the set values. Could someone help me
understand what is going on? I suspect the serialization of the object
happens not when I think it does....
object Foo{
var field1 = -1
var field2 = -2
}
def main(args: Array[String]) {
//open a yaml file
Foo.field1 = value_from_yaml
Foo.field2 = value from_yaml
val ssc = StreamingContext.getOrCreate(checkpointDirectory,
() => { createContext() })
ssc.start()
}
def createContext(){
// create streaming context
println(Foo) <--
foo's fields are set
lines.map(line =>{
println(Foo) <--
foo's fields are not set
})
}