Wrap an RDD with a ShuffledRDD

2015-11-08 Thread Muhammad Haseeb Javed
I am working on a modified Spark core and have a Broadcast variable which I deserialize to obtain an RDD along with its set of dependencies, as is done in ShuffleMapTask, as follows:
val taskBinary: Broadcast[Array[Byte]]
var (rdd, dep) = ser.deserialize[(RDD[_], ShuffleDependency[_, _, _])](

Communication between executors and drivers

2015-09-16 Thread Muhammad Haseeb Javed
How do executors communicate with the driver in Spark? I understand that it's done using Akka actors and that messages are exchanged as CoarseGrainedSchedulerMessage, but I'd really appreciate it if someone could explain the entire process in a bit more detail.

Re: Switch from Sort based to Hash based shuffle

2015-08-15 Thread Muhammad Haseeb Javed
Thanks guys, that did it. On Thu, Aug 13, 2015 at 6:49 PM, Akhil Das wrote: > Have a look at spark.shuffle.manager. You can switch between sort and hash > with this configuration. > > spark.shuffle.manager (default: sort): Implementation to use for shuffling data. There > are two implementations available:sor
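
For anyone finding this thread later, here is a minimal sketch of setting spark.shuffle.manager from application code. It assumes a Spark 1.x build on the classpath (the hash option was removed in Spark 2.0); the object name, app name, local master and sample data are hypothetical and only there to make the example self-contained.

```scala
import org.apache.spark.{SparkConf, SparkContext}

// Hypothetical driver program; assumes Spark 1.x, where "hash" is still a valid value.
object HashShuffleExample {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf()
      .setAppName("hash-shuffle-example")    // illustrative name
      .setMaster("local[2]")                 // local mode, for trying it out only
      .set("spark.shuffle.manager", "hash")  // default in Spark 1.2+ is "sort"

    val sc = new SparkContext(conf)
    try {
      // reduceByKey forces a shuffle, so the configured shuffle manager is exercised.
      val counts = sc.parallelize(Seq("a", "b", "a"))
        .map(word => (word, 1))
        .reduceByKey(_ + _)
        .collect()
      counts.foreach(println)
    } finally {
      sc.stop()
    }
  }
}
```

The same property can also be passed at submit time with --conf spark.shuffle.manager=hash or placed in spark-defaults.conf, which avoids hard-coding it in the application.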