Uh, for some reason I no longer seem to reply to the list automatically.
Here again is my message to Tom.
-- Forwarded message --
Tom,
On Wed, Aug 13, 2014 at 5:35 AM, Tom Vacek wrote:
This is a back-to-basics question. How do we know when Spark will clone an
object and distribute it with task closures versus synchronize access to it?
For example, the old rookie mistake of random number generation:
import scala.util.Random
val randRDD = sc.parallelize(0 until 1000).map(ii => R