from:"Quang-Nhat HOANG-XUAN"

Re: Scan Sharing in Spark

2015-05-05 Thread Quang-Nhat HOANG-XUAN

ay(1,2,3,4,5,6,7))
> >
> > //Apply this combine function to each of your data elements.
> > val res = data.map(sharedF)
> >
> > res.take(5)
> >
> > The result will look something like this.
> >
> > res5: Array[Seq[Int]] = Array(List(2, 5), List(3, 10), List(4, 15), List(5, 20), List(6, 25))
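
The definition of sharedF is cut off in the snippet above, so the following is only a sketch of the idea, not the poster's actual code. It assumes two per-element functions (x + 1 and x * 5, chosen because they are consistent with the quoted result) and a hypothetical helper named combine. A local Seq stands in for the RDD so the example runs without a Spark cluster; with an RDD[Int], the same data.map(sharedF) call would apply both functions in a single pass over the input.

```scala
object ScanSharingSketch {
  // Assumed per-element functions; the originals are not visible in the snippet.
  val addOne: Int => Int = _ + 1
  val timesFive: Int => Int = _ * 5

  // Hypothetical helper: merge several functions into one, so that each
  // input element is visited once and every function is applied to it.
  def combine[A, B](fs: Seq[A => B]): A => Seq[B] =
    a => fs.map(f => f(a))

  val sharedF: Int => Seq[Int] = combine(Seq(addOne, timesFive))

  // A local collection stands in for the RDD here (assumption).
  val data: Seq[Int] = Seq(1, 2, 3, 4, 5, 6, 7)

  // One traversal of the data produces the results of both functions.
  val res: Seq[Seq[Int]] = data.map(sharedF)

  def main(args: Array[String]): Unit =
    println(res.take(5))
}
```

The trade-off of this approach is that both computations must be expressible as per-element functions and their results travel together; jobs that need separate shuffles or actions would still have to split the combined output afterwards.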

Scan Sharing in Spark

2015-05-05 Thread Quang-Nhat HOANG-XUAN

Hi everyone, I have two Spark jobs inside a Spark application that read from the same input file. They are executed in two threads. Right now, I cache the input file in memory before executing these two jobs. Are there other ways to share the same input with only one read? I know ther