Re: Benchmark results between Flink and Spark

2015-07-06 Thread Hawin Jiang
Hi Stephan, yes, you are correct. It looks like TPCx-HS is an industry standard for big data. But how do we get a Flink number on that? I think it is also difficult to get a Spark performance number based on TPCx-HS. If you know someone who can provide servers for performance testing, I would like t

Re: Benchmark results between Flink and Spark

2015-07-06 Thread Slim Baltagi
Hi Vasia, thanks for sharing. 1. I would like to add a couple of resources about *BigBench*, the Big Data benchmark suite that you are referring to: https://github.com/intel-hadoop/Big-Data-Benchmark-for-Big-Bench and also http://blog.cloudera.com/blog/2014/11/bigbench-toward-an-industry-standar

Re: Benchmark results between Flink and Spark

2015-07-06 Thread Vasiliki Kalavri
Hi, Apart from the amplab benchmark, you might also find [1] and [2] interesting. The first is a survey on existing benchmarks, while the second proposes one. However, they are also limited to SQL-like queries. Regarding graph processing benchmarks, I recently came across Graphalytics [3]. The be

Re: Benchmark results between Flink and Spark

2015-07-06 Thread Slim Baltagi
Hi Hawin, what you shared is not 'the Spark benchmark'. This benchmark measures response time on a handful of relational queries across different tools, including Shark. Shark development ended a year ago, on July 1, 2014, in favor of Spark SQL, which graduated from an alpha project on March 13, 2015.

Re: Benchmark results between Flink and Spark

2015-07-06 Thread Stephan Ewen
Hi Hawin! The benchmark you refer to is a more or less pure SQL benchmark. For systems that are designed for exactly the "beyond SQL" applications (streaming, iterative algorithms, UDFs, ...), this benchmark is probably not very meaningful, as it covers none of these areas. Even in the SQL an

Re: Benchmark results between Flink and Spark

2015-07-06 Thread Hawin Jiang
Hi Slim and Fabian, here is the Spark benchmark: https://amplab.cs.berkeley.edu/benchmark/ Do we have a similar report or comparison like that? Thanks. Best regards, Hawin On Mon, Jul 6, 2015 at 6:32 AM, Slim Baltagi wrote: > Hi Fabian > > > I could not find which versions of Flink and Spar

RE: Benchmark results between Flink and Spark

2015-07-06 Thread Wang, Yanping
[mailto:fhue...@gmail.com] Sent: Sunday, July 05, 2015 10:18 AM To: user@flink.apache.org Subject: Re: Benchmark results between Flink and Spark Thanks for sharing, Slim! I had a look at the report (except for two pages which were not available in the preview). It compares four different tasks on a

Re: Benchmark results between Flink and Spark

2015-07-06 Thread Slim Baltagi
Hi Fabian > I could not find which versions of Flink and Spark were compared. According to Norman Spangenberg, one of the authors of the conference paper, the benchmark used *Spark* version *1.2.0* and *Flink* version *0.8.0*. I did ask him a few more questions about the benchmark betwee

Re: Benchmark results between Flink and Spark

2015-07-05 Thread Fabian Hueske
Thanks for sharing, Slim! I had a look at the report (except for two pages which were not available in the preview). It compares four different tasks on a setup with 4 rather small nodes (8 cores, 16GB memory). I could not find which versions of Flink and Spark were compared. The comparison tasks

Re: Benchmark results between Flink and Spark

2015-07-05 Thread Stephan Ewen
Hi Slim! Thank you for the link. Unfortunately, I cannot access the contents. I always get a "connection closed" error. Does anybody else experience something similar? Stephan On Sun, Jul 5, 2015 at 6:37 PM, Slim Baltagi wrote: > Hi > > Apache Flink outperforms Apache Spark in processing machin