Here are a few things I noticed in retrospect about the 1.17 release that I'd like to share (other release managers may have a different view or may disagree):
- Google Meet might not be the best choice for the release sync. We need to be able to invite attendees even if the creator of the meeting isn't available (maybe try Zoom or Jitsi instead?).
- A release sync every 2 weeks, switching to weekly after feature freeze, felt reasonable.
- Slack worked well as a collaboration tool for documenting the monitoring tasks (#builds [1], #flink-dev-benchmarks [2]) in a team with multiple release managers.
- The Slack Azure Pipeline bot seems to be buggy: it swallows some build failures. It's not a severe issue, though. We created #builds-debug [3] to monitor whether it's happening consistently. The issue is covered in FLINK-30733 [4].
- Having dedicated people monitor the build failures helps to get a more consistent picture of test instabilities.
- We experienced occasional issues in the manual steps of the release creation in the past (e.g. the japicmp config was not properly pushed). Creating Jira issues for the release made the release creation more transparent and the steps more reviewable [5][6][7][8]. Additionally, it helped to distribute subtasks to different people, with Jira being the tool for documentation and synchronization. That's especially helpful when more than one person is in charge of creating the release.
- Committers occasionally did backports/merges without PRs during the 1.17 release, which broke the master/release branches (probably, changes that were not part of the PR were made locally before merging to speed up the backport). It might make sense to remind everyone that this should be avoided. Not sure whether we want to or can restrict that.
- We observed a good response on fixing test instabilities by the end of the release cycle, but had some long-running issues earlier in the cycle, which caused extra effort for the release managers due to recurring test failures.
- Release testing picked up “slowly”: Initially, we planned 2 weeks for release testing. But there was not really any progress (tickets being created and worked on) in the first week. In the end, we had to extend the phase by another week, resulting in 3 instead of 2 weeks of release testing. I guess we could encourage the community to create release testing tasks earlier and label them properly so that the effort can be monitored. That would even enable us to do release testing for a certain feature once the feature is done and not necessarily only at the end of the release cycle.
- Manual test data generation is tedious (FLINK-31593 [9]). But this should be fixed in 1.18 with FLINK-27518 [10] being almost done.
- We started creating documentation for release management [11]. The goal is to collect the tasks involved in supporting a Flink release, to encourage newcomers to pick them up.

I'm going to add these to the Flink 1.17 release documentation [12] as feedback as well.

Best,
Matthias

[1] https://apache-flink.slack.com/archives/C03MR1HQHK2
[2] https://apache-flink.slack.com/archives/C0471S0DFJ9
[3] https://apache-flink.slack.com/archives/C04LZM3EE9E
[4] https://issues.apache.org/jira/browse/FLINK-30733
[5] https://issues.apache.org/jira/browse/FLINK-31146
[6] https://issues.apache.org/jira/browse/FLINK-31154
[7] https://issues.apache.org/jira/browse/FLINK-31562
[8] https://issues.apache.org/jira/browse/FLINK-31567
[9] https://issues.apache.org/jira/browse/FLINK-31593
[10] https://issues.apache.org/jira/browse/FLINK-27518
[11] https://cwiki.apache.org/confluence/display/FLINK/Flink+Release+Management
[12] https://cwiki.apache.org/confluence/display/FLINK/1.17+Release

On Sat, Mar 25, 2023 at 8:29 AM Hang Ruan <ruanhang1...@gmail.com> wrote:

> Thanks for the great work ! Congrats all!
>
> Best,
> Hang
>
> Panagiotis Garefalakis <pga...@apache.org> wrote on Sat, Mar 25, 2023 at 03:22:
>
>> Congrats all! Well done!
>>
>> Cheers,
>> Panagiotis
>>
>> On Fri, Mar 24, 2023 at 2:46 AM Qingsheng Ren <renqs...@gmail.com> wrote:
>>
>> > I'd like to say thank you to all contributors of Flink 1.17. Your support
>> > and great work together make this giant step forward!
>> >
>> > Also like Matthias mentioned, feel free to leave us any suggestions and
>> > let's improve the releasing procedure together.
>> >
>> > Cheers,
>> > Qingsheng
>> >
>> > On Fri, Mar 24, 2023 at 5:00 PM Etienne Chauchot <echauc...@apache.org>
>> > wrote:
>> >
>> >> Congrats to all the people involved!
>> >>
>> >> Best
>> >>
>> >> Etienne
>> >>
>> >> On 23/03/2023 at 10:19, Leonard Xu wrote:
>> >> > The Apache Flink community is very happy to announce the release of
>> >> > Apache Flink 1.17.0, which is the first release for the Apache Flink
>> >> > 1.17 series.
>> >> >
>> >> > Apache Flink® is an open-source unified stream and batch data
>> >> > processing framework for distributed, high-performing,
>> >> > always-available, and accurate data applications.
>> >> >
>> >> > The release is available for download at:
>> >> > https://flink.apache.org/downloads.html
>> >> >
>> >> > Please check out the release blog post for an overview of the
>> >> > improvements for this release:
>> >> > https://flink.apache.org/2023/03/23/announcing-the-release-of-apache-flink-1.17/
>> >> >
>> >> > The full release notes are available in Jira:
>> >> > https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522&version=12351585
>> >> >
>> >> > We would like to thank all contributors of the Apache Flink community
>> >> > who made this release possible!
>> >> >
>> >> > Best regards,
>> >> > Qingsheng, Martijn, Matthias and Leonard