I don’t think “benchmarking” frameworks WRT Kafka is particularly informative. The various frameworks available are better compared WRT their features and processing limitations. For example, Akka Streams for Kafka offers a more intuitive way to express asynchronous operations. If you were to benchmark each framework with a simple poll-transformation-publish workload, I think you would find very little difference between them (assuming that they were all configured appropriately; the minimum consumer bytes setting, for instance). I think each framework would be better evaluated according to its features. Just my thoughts.

-David

On 3/23/17, 9:38 AM, "Eno Thereska" <eno.there...@gmail.com> wrote:

Hi Giselle,

Great idea! In Kafka Streams we have a few micro-benchmarks we run nightly. They are at:
https://github.com/apache/kafka/blob/trunk/streams/src/test/java/org/apache/kafka/streams/perf/SimpleBenchmark.java

It's mostly simple stuff (aggregations, joins) and we are continuously updating them and adding more. The nightly results are kept publicly at http://testing.confluent.io/confluent-kafka-system-test-results/, e.g., see report on 2017-03-21:
http://confluent-kafka-system-test-results.s3-us-west-2.amazonaws.com/2017-03-21--001.1490119830--apache--trunk--05690f0/report.html
(search for "simple_benchmark_test").

Your feedback and input is always appreciated.

Thanks,
Eno

> On 23 Mar 2017, at 10:09, Giselle van Dongen <giselle.vandon...@ugent.be> wrote:
>
> Dear users of Streaming Technologies,
>
> As a PhD student in big data analytics, I am currently in the process of
> compiling a list of benchmarks (to test multiple streaming frameworks) in
> order to create an expanded benchmarking suite. The benchmark suite is being
> developed as a part of my current work at Ghent University.
>
> The included frameworks at this time are, in no particular order, Spark,
> Flink, Kafka (Streams), Storm (Trident) and Drizzle. Any pointers to
> previous work or relevant benchmarks would be appreciated.
>
> Best regards,
> Giselle van Dongen