Fwd: Task closures and synchronization

2014-08-12 Thread Tobias Pfeiffer
Uh, for some reason I no longer seem to reply to the list automatically. Here is my message to Tom again. -- Forwarded message -- Tom, On Wed, Aug 13, 2014 at 5:35 AM, Tom Vacek wrote: > This is a back-to-basics question. How do we know when Spark will clone > an object a

Task closures and synchronization

2014-08-12 Thread Tom Vacek
This is a back-to-basics question. How do we know when Spark will clone an object and distribute it with task closures versus synchronize access to it? For example, the old rookie mistake of random number generation: import scala.util.Random val randRDD = sc.parallelize(0 until 1000).map(ii => R