Re: Apache Spark Contribution

2017-02-03 Thread Steve Loughran

You might want to look at "Nephele: Efficient Parallel Data Processing in the Cloud", Warneke & Kao, 2009: http://stratosphere.eu/assets/papers/Nephele_09.pdf. This was some of the work done in the research project which gave birth to Flink, though this bit didn't surface as they chose to leave VM a

Re: Apache Spark Contribution

2017-02-02 Thread Shuai Lin

> The goal of the project is to develop an algorithm that automatically
> scales the cluster up and down based on the volume of data processed by the
> application.

By "scale the cluster up and down" do you mean: 1) adding/removing Spark executors based on the load? How is that from the dynami