As far as I understand, the best way to generate seeded random numbers in Spark
is to use mapPartitions with a seeded Random instance for each partition.
But graph.pregel in GraphX does not expose anything similar to mapPartitions.
Can something like this be done with the GraphX Pregel API?
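One workaround people use (a sketch, not an official API) is to derive a deterministic per-vertex seed inside the vertex program by mixing a global seed with the vertex ID, so results stay reproducible without needing partition-level hooks. The `vprog` signature and the attribute types below are illustrative assumptions:

```scala
import scala.util.Random
import org.apache.spark.graphx._

// Hypothetical sketch: reproducible per-vertex randomness inside Pregel.
val globalSeed = 42L

// Each vertex gets its own deterministic Random: the same globalSeed and
// vertex id always yield the same sequence, regardless of partitioning.
def vprog(id: VertexId, attr: Double, msg: Double): Double = {
  val rng = new Random(globalSeed ^ id)
  attr + msg * rng.nextDouble()
}

// Usage sketch (sendMsg and mergeMsg assumed defined elsewhere):
// graph.pregel(initialMsg = 0.0)(vprog, sendMsg, mergeMsg)
```

Creating a Random per vertex per superstep is cheaper than it looks and avoids any dependence on how GraphX happens to partition the graph.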
Hi,
After switching from Spark 0.8.0 to Spark 0.9.0 (and to Scala 2.10), one
application started hanging after the main thread is done (in 'local[2]' mode,
without a cluster).
Adding SparkContext.stop() at the end solves this.
Is this behavior normal, and is shutting down the SparkContext required?
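For reference, the pattern that fixed it looks like this (a minimal sketch; the app name and job body are placeholders):

```scala
import org.apache.spark.SparkContext

object Main {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext("local[2]", "MyApp")
    try {
      // ... actual job logic; a trivial placeholder here ...
      val total = sc.parallelize(1 to 100).sum()
      println(total)
    } finally {
      // Explicitly shut the context down so Spark's non-daemon threads
      // don't keep the JVM alive after main() returns.
      sc.stop()
    }
  }
}
```

Wrapping the job in try/finally also guarantees the shutdown runs if the job throws.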
Hi.
We're thinking about writing a tool that would read Spark logs and reconstruct
the cache contents at a given point in time (e.g. to see what data fills the
cache and whether some of it could be unpersisted to improve performance).
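As a possibly simpler alternative to log parsing, SparkContext can report storage status at runtime via getRDDStorageInfo (assuming your Spark version exposes it; it is present around 0.9). A sketch, where `sc` is your existing SparkContext:

```scala
// Sketch: inspect what is currently cached without parsing logs.
val cached = sc.getRDDStorageInfo
cached.foreach { info =>
  println(s"RDD ${info.id} '${info.name}': " +
    s"${info.memSize} bytes in memory, " +
    s"${info.numCachedPartitions}/${info.numPartitions} partitions cached")
}
```

This only shows a live snapshot, though; reconstructing cache state at an arbitrary past point would still need the log-based approach you describe.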
Are there similar projects that already exist? Is there a list o