After pausing the 1.15.1 release a couple of days ago because of FLINK-28060 [1], I want to resume it now. The ticket has the details, but in summary: the upgrade of the Kafka Clients to 2.8.1 in Flink 1.15.0 introduced a bug where, after a Kafka broker restarts, subsequent attempts by Flink to commit Kafka offsets fail until Flink is restarted. The only good fix available to us appears to be upgrading the Kafka Clients to version 3.1.1.

It doesn't seem wise to upgrade the Kafka Clients from 2.8.1 to 3.1.1 for Flink 1.15.1. That's a big change to make this close to our release, and we shouldn't delay this release any further. So here's a proposal:

- We bump the Kafka Clients to 3.1.1 in master now.
- We don't try to fix FLINK-28060 for 1.15.1.
- We create the Flink 1.15.1 release straight away, noting that there's a known issue with Kafka (FLINK-28060).
- We reach out to the Kafka community to see if they're willing to create a 2.8.2 release with a backport of the fix for this bug.
- In parallel, we merge a bump of the Kafka Clients to 3.1.1 after 1.15.1 is done, to see how it behaves on the CI for the next few weeks, and plan a quick Flink 1.15.2 release (most likely something like a month later).

[1] https://issues.apache.org/jira/browse/FLINK-28060

Best,
David

On Wed, Jun 15, 2022 at 11:37 AM David Anderson <dander...@apache.org> wrote:

> I'm now thinking we should delay 1.15.1 long enough to see if we can
> include a fix for FLINK-28060 [1], which is a serious regression affecting
> several Kafka users.
>
> [1] https://issues.apache.org/jira/browse/FLINK-28060
>
> On Fri, Jun 10, 2022 at 12:15 PM David Anderson <dander...@apache.org>
> wrote:
>
>> Since no one has brought up any blockers, I'll plan to start the release
>> process on Monday unless I hear otherwise.
>>
>> Best,
>> David
>>
>> On Thu, Jun 9, 2022 at 10:20 AM Yun Gao <yungao...@aliyun.com.invalid>
>> wrote:
>>
>>> Hi David,
>>>
>>> Very thanks for driving the new version, also +1 since we already
>>> accumulated some fixes.
>>>
>>> Regarding https://issues.apache.org/jira/browse/FLINK-27492, currently
>>> there are still some controversy with how to deal with the artifacts.
>>> I also agree we may not hold up the release with this issue. We'll try
>>> to reach to the consensus as soon as possible to try best catching up
>>> with the release.
>>>
>>> Best,
>>> Yun
>>>
>>> ------------------------------------------------------------------
>>> From: LuNing Wang <wang4lun...@gmail.com>
>>> Send Time: 2022 Jun. 9 (Thu.) 10:10
>>> To: dev <dev@flink.apache.org>
>>> Subject: Re: [DISCUSS] Releasing 1.15.1
>>>
>>> Hi David,
>>>
>>> +1
>>> Thank you for driving this.
>>>
>>> Best regards,
>>> LuNing Wang
>>>
>>> Jing Ge <j...@ververica.com> wrote on Wed, Jun 8, 2022 at 20:45:
>>>
>>> > +1
>>> >
>>> > Thanks David for driving it!
>>> >
>>> > Best regards,
>>> > Jing
>>> >
>>> > On Wed, Jun 8, 2022 at 2:32 PM Xingbo Huang <hxbks...@gmail.com> wrote:
>>> >
>>> > > Hi David,
>>> > >
>>> > > +1
>>> > > Thank you for driving this.
>>> > >
>>> > > Best,
>>> > > Xingbo
>>> > >
>>> > > Chesnay Schepler <ches...@apache.org> wrote on Wed, Jun 8, 2022 at 18:41:
>>> > >
>>> > > > +1
>>> > > >
>>> > > > Thank you for proposing this. I can take care of the PMC-side of
>>> > > > things.
>>> > > >
>>> > > > On 08/06/2022 12:37, Jingsong Li wrote:
>>> > > > > +1
>>> > > > >
>>> > > > > Thanks David for volunteering to manage the release.
>>> > > > >
>>> > > > > Best,
>>> > > > > Jingsong
>>> > > > >
>>> > > > > On Wed, Jun 8, 2022 at 6:21 PM Jark Wu <imj...@gmail.com> wrote:
>>> > > > >> Hi David, thank you for driving the release.
>>> > > > >>
>>> > > > >> +1 for the 1.15.1 release. The release-1.15 branch
>>> > > > >> already contains many bug fixes and some SQL
>>> > > > >> issues are quite critical.
>>> > > > >>
>>> > > > >> Btw, FLINK-27606 has been merged just now.
>>> > > > >>
>>> > > > >> Best,
>>> > > > >> Jark
>>> > > > >>
>>> > > > >> On Wed, 8 Jun 2022 at 17:40, David Anderson <dander...@apache.org> wrote:
>>> > > > >>
>>> > > > >>> I would like to start a discussion on releasing 1.15.1.
>>> > > > >>> Flink 1.15 was released on the 5th of May [1] and so far 43 issues
>>> > > > >>> have been resolved, including several user-facing issues with
>>> > > > >>> blocker and critical priorities [2]. (The recent problem with
>>> > > > >>> FileSink rolling policies not working properly in 1.15.0 got me
>>> > > > >>> thinking it might be time for bug-fix release.)
>>> > > > >>>
>>> > > > >>> There currently remain 16 unresolved tickets with a fixVersion of
>>> > > > >>> 1.15.1 [3], five of which are about CI infrastructure and tests.
>>> > > > >>> There is only one such ticket marked Critical:
>>> > > > >>>
>>> > > > >>> https://issues.apache.org/jira/browse/FLINK-27492 Flink table scala
>>> > > > >>> example does not including the scala-api jars
>>> > > > >>>
>>> > > > >>> I'm not convinced we should hold up a release for this issue, but on
>>> > > > >>> the other hand, it seems that this issue can be resolved by making a
>>> > > > >>> decision about how to handle the missing dependencies. @Timo Walther
>>> > > > >>> <twal...@apache.org> @yun_gao can you give an update?
>>> > > > >>>
>>> > > > >>> Two other open issues seem to have made significant progress (listed
>>> > > > >>> below). Would it make sense to wait for either of these? Are there
>>> > > > >>> any other open tickets we should consider waiting for?
>>> > > > >>>
>>> > > > >>> https://issues.apache.org/jira/browse/FLINK-27420 Suspended
>>> > > > >>> SlotManager fail to reregister metrics when started again
>>> > > > >>> https://issues.apache.org/jira/browse/FLINK-27606 CompileException
>>> > > > >>> when using UDAF with merge() method
>>> > > > >>>
>>> > > > >>> I would volunteer to manage the release. Is there a PMC member who
>>> > > > >>> would join me to help, as needed?
>>> > > > >>>
>>> > > > >>> Best,
>>> > > > >>> David
>>> > > > >>>
>>> > > > >>> [1] https://flink.apache.org/news/2022/05/05/1.15-announcement.html
>>> > > > >>>
>>> > > > >>> [2]
>>> > > > >>> https://issues.apache.org/jira/issues/?jql=project%20%3D%20FLINK%20AND%20status%20in%20%20(Resolved%2C%20Closed)%20AND%20fixVersion%20%3D%201.15.1%20ORDER%20BY%20priority%20DESC%2C%20created%20DESC
>>> > > > >>>
>>> > > > >>> [3]
>>> > > > >>> https://issues.apache.org/jira/browse/FLINK-27492?jql=project%20%3D%20FLINK%20AND%20status%20in%20(Open%2C%20%22In%20Progress%22%2C%20Reopened)%20AND%20fixVersion%20%3D%201.15.1%20ORDER%20BY%20priority%20DESC%2C%20created%20DESC