Why not use an existing benchmarking tool -- is there one? Perhaps you'd like to build something like YCSB [0] but for streaming workloads?
Apache Storm is the OSS framework that's been around the longest. Search for "apache storm benchmark" and you'll get some promising hits. Looks like IBMStreams has a tool [1] and the Ericsson research blog has a detailed post [2] as well.

[0]: https://github.com/brianfrankcooper/YCSB
[1]: https://github.com/IBMStreams/benchmarks/wiki/Running-Apache-Storm-benchmark
[2]: http://www.ericsson.com/research-blog/data-knowledge/trident-benchmarking-performance/

On Mon, Nov 16, 2015 at 6:21 AM, Vasiliki Kalavri <vasilikikala...@gmail.com> wrote:

> Hello squirrels,
>
> with some colleagues and students here at KTH, we have started 2 projects
> to evaluate (1) performance and (2) behavior in the presence of memory
> interference in cloud environments, for Flink and other systems. We want to
> provide our students with a workload of representative applications for
> testing.
>
> While for batch applications, it is quite clear to us what classes of
> applications are widely used and how to create a workload of different
> types of applications, we are not quite sure about the streaming workload.
>
> That's why, we'd like your opinions! If you're using Flink streaming in
> your company or your project, we'd love your input even more :-)
>
> What kind of applications would you consider as "representative" of a
> streaming workload? Have you run any experiments to evaluate Flink versus
> Spark, Storm etc.? If yes, would you mind sharing your code with us?
>
> We will of course be happy to share our results with everyone after we
> have completed our study.
>
> Thanks a lot!
> -Vasia.
>