Thanks for this contribution, Piotr and Nico. Tools like this are really useful for Flink’s success.

Cheers,
Kostas

> On Sep 21, 2018, at 4:59 PM, Piotr Nowojski <pi...@data-artisans.com> wrote:
>
> Hello community,
>
> For almost a year at data Artisans, Nico and I have been maintaining a
> setup that continuously evaluates Flink with benchmarks defined at
> https://github.com/dataArtisans/flink-benchmarks. With growing interest
> and after it proved useful a couple of times, we have finally decided to
> publish the web UI layer of this setup. Currently it is accessible via
> the following (maybe not so?) temporary URL:
>
> http://codespeed.dak8s.net:8000
>
> This is a simple web UI to present performance changes over past and
> present commits to Apache Flink. It only has a couple of views and the
> most useful ones are:
>
> 1. Timeline
> 2. Comparison (I recommend using normalization)
>
> Timeline is useful for spotting unintended regressions or unexpected
> improvements. It is updated every six hours.
> Comparison is useful for comparing a given branch (for example a pending
> PR) with the master branch. More about that later.
>
> The codespeed project on its own is just a presentation layer. As
> mentioned before, the only currently available benchmarks are defined in
> the flink-benchmarks repository and they are executed periodically or on
> demand by Jenkins on a single bare metal machine. The current setup
> limits us to micro benchmarks (they are easier to set up/develop/maintain
> and have a quicker feedback loop compared to cluster benchmarks), but
> nothing prevents us from setting up other kinds of benchmarks and
> uploading their results to our codespeed instance as well.
>
> Regarding the comparison view: currently data Artisans’ Flink mirror
> repository at https://github.com/dataArtisans/flink is configured to
> trigger benchmark runs on every commit/change that happens on the
> benchmark-request branch (we chose to use dataArtisans' repository here
> because we needed a custom GitHub hook that we couldn’t add to the
> apache/flink repository). Benchmarking usually takes between one and two
> hours. One obvious limitation at the moment is that there is only one
> comparison view, with one comparison branch, so trying to compare two
> PRs at the same time is impossible. However, we can tackle this problem
> once it becomes a real issue, not only a theoretical one.
>
> Piotrek & Nico
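
For anyone curious how "other kinds of benchmarks" might upload results to a codespeed instance, here is a rough sketch rather than a description of the actual Jenkins job. It assumes the instance exposes Codespeed's usual `result/add/` POST endpoint with its standard form fields, and that uploads from outside are accepted at all, which the announcement does not say. The project, executable, environment and benchmark names are hypothetical placeholders. The request is only built here, not sent.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Instance URL taken from the announcement; it may be temporary.
CODESPEED_URL = "http://codespeed.dak8s.net:8000/"


def build_result_request(commit_id, benchmark, value, branch="master"):
    """Build (but do not send) a POST request carrying one benchmark result.

    Field names follow Codespeed's result/add/ form as an assumption;
    the project, executable and environment values are hypothetical.
    """
    data = {
        "commitid": commit_id,
        "branch": branch,
        "project": "Flink",             # hypothetical project name
        "executable": "Flink",          # hypothetical executable name
        "benchmark": benchmark,
        "environment": "bare-metal-1",  # hypothetical environment name
        "result_value": value,
    }
    body = urlencode(data).encode("utf-8")
    return Request(CODESPEED_URL + "result/add/", data=body, method="POST")


# Hypothetical commit id and benchmark name, for illustration only.
req = build_result_request("abc123", "hypotheticalClusterBenchmark", 1234.5)
print(req.get_method(), req.full_url)
```

Actually sending it would be a matter of passing `req` to `urllib.request.urlopen`, assuming the environment and project already exist on the server side.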