Hi Dong,

The main issue with an automated tool at the moment is that some benchmarks are quite noisy, and performance regressions often fall within the noise of a given benchmark. Our existing tooling cannot handle those cases. Until we address this, I think it will have to remain a manual process. There is a ticket mentioned by Yuan [1] where I have written a comment and a proposal on how to improve automatic performance regression detection.
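To illustrate the kind of rule I have in mind, here is a minimal, hypothetical sketch of a noise-aware check. This is not our codespeed tooling; the function name, thresholds, and data are made up. The idea is to alert only when a drop exceeds both the benchmark's own historical noise band and a practically relevant margin:

from statistics import mean, stdev

def is_regression(baseline: list[float], recent: list[float],
                  min_sigma: float = 3.0, min_drop_pct: float = 2.0) -> bool:
    """Flag a regression only if recent scores drop below the baseline
    by more than the baseline's noise band (min_sigma standard
    deviations) AND by a practically relevant margin (min_drop_pct).
    Scores are throughput-style, i.e. higher is better."""
    base_mean = mean(baseline)
    drop = base_mean - mean(recent)
    if drop < min_sigma * stdev(baseline):
        return False  # within the benchmark's own noise: no alert
    return drop / base_mean * 100.0 >= min_drop_pct

# A noisy benchmark: a 5% average drop is still inside the 3-sigma band.
print(is_regression([100.0, 92.0, 108.0, 97.0, 103.0],
                    [95.0, 94.0, 96.0]))  # -> False, no alert

The hard part is that for noisy benchmarks such a rule either misses real regressions or, with tighter thresholds, produces false alarms, which is exactly why this is still manual.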
Best,
Piotrek

[1] https://issues.apache.org/jira/browse/FLINK-29825?focusedCommentId=17679077&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-17679077

On Mon, Jan 30, 2023 at 15:31 Dong Lin <lindon...@gmail.com> wrote:

> Hi Yanfei,
>
> Thanks for driving the benchmark monitoring effort! The Google doc and the
> community wiki look pretty good.
>
> According to Yuan's comment, it seems that we currently watch the
> benchmark results manually to detect regressions. Have we considered
> automating this process, e.g. by exporting the nightly benchmark results
> to a database and using scripts to detect regressions based on
> pre-defined rules?
>
> This approach is probably more scalable and accurate in the long term,
> and I had a good experience working with such a regression detection
> tool in my past job.
>
> Thanks,
> Dong
>
>
>
> On Thu, Jan 19, 2023 at 4:02 PM Yanfei Lei <fredia...@gmail.com> wrote:
>
> > Hi devs,
> >
> > I'd like to start a discussion about incorporating performance
> > regression monitoring into the routine release process. Flink
> > benchmarks are periodically executed on http://codespeed.dak8s.net:8080
> > to monitor Flink performance. In late Oct '22, a new Slack channel,
> > #flink-dev-benchmarks, was created for notifications of performance
> > regressions. It has helped us find 2 build failures [1,2] and 5
> > performance regressions [3,4,5,6,7] in the past 3 months, which is
> > very valuable for ensuring the quality of the code.
> >
> > Some release managers (cc @Matthias, @Martijn, @Qingsheng) have
> > proposed incorporating performance regression monitoring into release
> > management. I think this makes sense for performance stability (much
> > like CI stability): almost every release contains tickets about
> > performance optimizations, so performance monitoring can effectively
> > prevent performance regressions and track the performance improvements
> > of each release. I am starting this discussion to pick everyone's
> > brain for suggestions.
> >
> > In the past, I checked the Slack notifications once a week, and I have
> > summarized a draft [8] (
> > https://docs.google.com/document/d/1jTTJHoCTf8_LAjviyAY3Fi7p-tYtl_zw7rJKV4V6T_c/edit?usp=sharing
> > ) on how to deal with performance regressions, based on my own
> > experience and that of other contributors. If the proposal is
> > considered acceptable, I'd like to put it in the community wiki [9].
> >
> > Looking forward to your feedback!
> >
> > [1] https://issues.apache.org/jira/browse/FLINK-29883
> > [2] https://issues.apache.org/jira/browse/FLINK-30015
> > [3] https://issues.apache.org/jira/browse/FLINK-29886
> > [4] https://issues.apache.org/jira/browse/FLINK-30181
> > [5] https://issues.apache.org/jira/browse/FLINK-30623
> > [6] https://issues.apache.org/jira/browse/FLINK-30624
> > [7] https://issues.apache.org/jira/browse/FLINK-30625
> > [8] https://docs.google.com/document/d/1jTTJHoCTf8_LAjviyAY3Fi7p-tYtl_zw7rJKV4V6T_c/edit?usp=sharing
> > [9] https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=115511847
> >
> > Best,
> > Yanfei
> >