Re: RDD immutability

2016-01-19 Thread Dave
Thanks Sean. On 19/01/16 13:36, Sean Owen wrote: It's a good question. You can easily imagine an RDD of classes that are mutable. Yes, if you modify these objects, the result is pretty undefined, so don't do that. On Tue, Jan 19, 2016 at 12:27 PM, Dave wrote: Hi Marco, Yes, that answers my q

Re: RDD immutability

2016-01-19 Thread Sean Owen
It's a good question. You can easily imagine an RDD of classes that are mutable. Yes, if you modify these objects, the result is pretty undefined, so don't do that. On Tue, Jan 19, 2016 at 12:27 PM, Dave wrote: > Hi Marco, > > Yes, that answers my question. I just wanted to be sure as the API gav
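
A minimal plain-Python sketch of why mutating elements leads to undefined results. This is an analogy, not Spark code: the list `source` stands in for a partition held in memory, and `Counter`, `bump_in_place`, and `bump_copy` are hypothetical names introduced for illustration. It assumes the same input may be evaluated more than once, as can happen when an uncached RDD is recomputed for a second action.

```python
from dataclasses import dataclass


@dataclass
class Counter:
    """Hypothetical mutable element type, standing in for a user class."""
    value: int


def bump_in_place(c: Counter) -> Counter:
    # Mutates the input element: the pattern the thread warns against.
    c.value += 1
    return c


def bump_copy(c: Counter) -> Counter:
    # Leaves the input untouched and returns a new element.
    return Counter(c.value + 1)


# Stand-in for a partition of elements held in memory (assumption).
source = [Counter(1), Counter(2)]

# Evaluating the non-mutating function twice gives the same answer.
pure_first = [bump_copy(c) for c in source]
pure_second = [bump_copy(c) for c in source]

# Evaluating the mutating function twice does not: the second pass
# sees elements that the first pass already changed.
mutating_first = [bump_in_place(c).value for c in source]
mutating_second = [bump_in_place(c).value for c in source]
```

In this sketch the two non-mutating passes agree, while the two mutating passes differ. In Spark, whether a mutation is visible to a later computation depends on details such as caching and serialization, which is why the outcome is best treated as undefined.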

Re: RDD immutability

2016-01-19 Thread Marco
It depends on what you mean by "write access". The RDDs are immutable, so you can't really change them. When you apply a mapping/filter/groupBy function, you are creating a new RDD starting from the original one. Kind regards, Marco 2016-01-19 13:27 GMT+01:00 Dave : > Hi Marco, > > Yes, that an
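
The idea that a transformation yields a new dataset rather than changing the original can be sketched in plain Python. `FrozenDataset` is a hypothetical stand-in written for this illustration and is not part of Spark; unlike an RDD it evaluates eagerly and holds everything in one tuple.

```python
from typing import Callable, Iterable, List, Tuple


class FrozenDataset:
    """Hypothetical immutable dataset; an analogy, not a Spark class."""

    def __init__(self, items: Iterable[int]) -> None:
        # A tuple cannot be appended to or reassigned element by element.
        self._items: Tuple[int, ...] = tuple(items)

    def map(self, fn: Callable[[int], int]) -> "FrozenDataset":
        # Builds a new dataset; self._items is only read.
        return FrozenDataset(fn(x) for x in self._items)

    def filter(self, pred: Callable[[int], bool]) -> "FrozenDataset":
        # Builds a new dataset holding only the matching elements.
        return FrozenDataset(x for x in self._items if pred(x))

    def collect(self) -> List[int]:
        # Returns a fresh list so callers cannot alter the stored tuple.
        return list(self._items)


numbers = FrozenDataset([1, 2, 3, 4])
doubled = numbers.map(lambda x: x * 2)
evens = numbers.filter(lambda x: x % 2 == 0)
```

After both calls, `numbers` still holds its original elements, and `doubled` and `evens` are separate objects derived from it.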

Re: RDD immutability

2016-01-19 Thread Dave
Hi Marco, Yes, that answers my question. I just wanted to be sure, as the API gave me write access to the immutable data, which means it's up to the developer to know not to modify the input parameters for these APIs. Thanks for the response. Dave. On 19/01/16 12:25, Marco wrote: Hello, RDD

Re: RDD immutability

2016-01-19 Thread Marco
Hello, RDDs are immutable by design. The reasons, to quote Sean Owen in this answer ( https://www.quora.com/Why-is-a-spark-RDD-immutable ), are the following: Immutability rules out a big set of potential problems due to updates from multiple threads at once. Immutable data is definitely safe t