Given that there might still be users who want to upgrade and, thus, are
reading the release notes for the 1.12.0 release, I would also update the
1.12.0 release notes. Moreover, I would consider the risk of losing state
because unaligned checkpoints might break recovery as quite a serious
problem. Imagine a user running a production job that, by some lucky
coincidence, does not need to recover for some time, and then all of a
sudden Flink fails fatally at the first job failure. I would even argue
that such a problem would warrant a documentation update where we add a
warning box to [1] which states the current limitations. Of course, this
only holds true under the assumption that this is indeed a real problem and
not a test instability.

[1]
https://ci.apache.org/projects/flink/flink-docs-stable/ops/state/checkpoints.html#unaligned-checkpoints

Cheers,
Till

On Mon, Dec 28, 2020 at 12:29 PM Xintong Song <tonysong...@gmail.com> wrote:

> Adding it as a warning of known issues to 1.12.1 release notes makes sense
> to me. (If it doesn't get fixed in this release. I'm canceling 1.12.1-rc1
> for another blocker.)
>
> I'm not entirely sure about adding a warning to the 1.12.0 release notes. Is
> that what we usually do: add warnings to release notes for bugs found after
> the release?
> Putting it another way, should we modify the release notes silently after
> they are posted/published?
>
> @Piotr,
> Do we understand in which cases the recovery of unaligned checkpoints can
> lead to a corrupted data stream? Or should we advise users never to use
> unaligned checkpoints with this version? Maybe you or Roman would be the
> better person to draft this warning?
>
> Thank you~
>
> Xintong Song
>
>
>
> On Mon, Dec 28, 2020 at 6:10 PM Till Rohrmann <trohrm...@apache.org>
> wrote:
>
> > Alright, thanks for the clarification. Should we issue a warning not to use
> > unaligned checkpoints for the time being because they can lead to corrupted
> > data streams on recovery? I can envision that some of our users might be
> > surprised by it. Maybe add it to the 1.12.0 and 1.12.1 release notes?
> >
> > Cheers,
> > Till
> >
> > On Mon, Dec 28, 2020 at 10:50 AM Piotr Nowojski <pnowoj...@apache.org>
> > wrote:
> >
> > > Hi,
> > >
> > > Yes, as Xintong wrote above, I wrote to him offline:
> > >
> > > > I’m going to remove release blocker status from FLINK-20654. After all,
> > > > we have already released it, at least in 1.12.0, and maybe even earlier.
> > > > There is no point in blocking a release (which probably has some
> > > > important bug fixes) in that case. It’s not a new bug.
> > >
> > > By "it's not a new bug", I meant that it has already been released in
> > > 1.12.0. Also, now that the test is ignored for the time being, this bug
> > > should not be causing build failures anymore.
> > >
> > > Piotrek
> > >
> > > On Mon, Dec 28, 2020 at 10:29 AM Xintong Song <tonysong...@gmail.com>
> > > wrote:
> > >
> > > > Hi Till,
> > > >
> > > > @Piotr and @Roman mentioned offline that FLINK-20654 is not a new bug
> > > > and they don't think we should block a release on it.
> > > >
> > > >
> > > > I guess we should have made the conversation publicly visible. Sorry
> > > > for the confusion.
> > > >
> > > >
> > > > Thank you~
> > > >
> > > > Xintong Song
> > > >
> > > >
> > > >
> > > > On Mon, Dec 28, 2020 at 4:59 PM Till Rohrmann <trohrm...@apache.org>
> > > > wrote:
> > > >
> > > > > Hi Xintong,
> > > > >
> > > > > quick question, what about FLINK-20654? Previously it was listed as a
> > > > > release blocker but has not been fixed yet.
> > > > >
> > > > > Cheers,
> > > > > Till
> > > > >
> > > > > On Fri, Dec 25, 2020 at 10:52 AM Xintong Song <tonysong...@gmail.com>
> > > > > wrote:
> > > > >
> > > > > > Hi devs,
> > > > > >
> > > > > > I'm very glad to announce that all known blocker issues for
> > > > > > release-1.12.1 have been resolved.
> > > > > >
> > > > > > I'm creating our first release candidate now and will start a
> > > > > > separate voting thread as soon as RC1 is created.
> > > > > >
> > > > > > Thanks everyone, and Merry Christmas.
> > > > > >
> > > > > > Thank you~
> > > > > >
> > > > > > Xintong Song
> > > > > >
> > > > > >
> > > > > >
> > > > > > On Wed, Dec 23, 2020 at 6:07 PM Xintong Song <tonysong...@gmail.com>
> > > > > > wrote:
> > > > > >
> > > > > > > Hi devs,
> > > > > > >
> > > > > > > Updates on the progress of release.
> > > > > > >
> > > > > > > In the past week, more than 20 issues were resolved for release
> > > > > > > 1.12.1. Thanks for the efforts.
> > > > > > >
> > > > > > > We still have 3 unresolved release blockers at the moment.
> > > > > > >
> > > > > > >    - [FLINK-20648] Unable to restore from savepoints with
> > > > > > >    Kubernetes HA.
> > > > > > >    Consensus has been reached on the solution. @Yang Wang is
> > > > > > >    working on a PR.
> > > > > > >    - [FLINK-20654] Unaligned checkpoint recovery may lead to
> > > > > > >    corrupted data stream.
> > > > > > >    @Roman Khachatryan is still investigating the problem.
> > > > > > >    - [FLINK-20664] Support setting service account for
> > > > > > >    TaskManager pod.
> > > > > > >    Boris Lublinsky has opened a PR, which has already been reviewed
> > > > > > >    and is close to mergeable.
> > > > > > >
> > > > > > > Since we are targeting a swift release, I don't intend to further
> > > > > > > delay the release for other non-blocker issues unless there's a
> > > > > > > good reason.
> > > > > > > If there's anything that you believe is absolutely necessary for
> > > > > > > release 1.12.1, please reach out to me.
> > > > > > > Otherwise, the voting process will be started as soon as the above
> > > > > > > blockers are addressed.
> > > > > > >
> > > > > > > Thank you~
> > > > > > >
> > > > > > > Xintong Song
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > On Mon, Dec 21, 2020 at 10:05 AM Xingbo Huang <hxbks...@gmail.com>
> > > > > > > wrote:
> > > > > > >
> > > > > > >> Hi Xintong,
> > > > > > >>
> > > > > > >> Thanks a lot for driving this.
> > > > > > >>
> > > > > > >> I'd like to bring one more issue to your attention:
> > > > > > >> https://issues.apache.org/jira/browse/FLINK-20389.
> > > > > > > >> This issue occurs quite frequently. Arvid and Kezhu have done some
> > > > > > > >> investigation of this issue, and it may indicate a bug in the new
> > > > > > > >> Source API. It would be great to figure out the root cause of this
> > > > > > > >> issue.
> > > > > > >>
> > > > > > >> Best,
> > > > > > >> Xingbo
> > > > > > >>
> > > > > > > >> On Fri, Dec 18, 2020 at 7:49 PM Xintong Song <tonysong...@gmail.com> wrote:
> > > > > > >>
> > > > > > >> > Thanks for the replies so far.
> > > > > > >> >
> > > > > > > >> > I've been reaching out to the owners of the reported issues. It
> > > > > > > >> > seems most of the blockers are likely to be resolved in the next
> > > > > > > >> > few days.
> > > > > > > >> >
> > > > > > > >> > Since some of the issues are quite critical, I'd like to aim for a
> > > > > > > >> > *feature freeze on Dec. 23rd*, and start the release voting process
> > > > > > > >> > by the end of this week.
> > > > > > > >> >
> > > > > > > >> > If there's anything you might need more time for, please reach out
> > > > > > > >> > to me.
> > > > > > >> >
> > > > > > >> > Thank you~
> > > > > > >> >
> > > > > > >> > Xintong Song
> > > > > > >> >
> > > > > > >> >
> > > > > > >> >
> > > > > > > >> > On Fri, Dec 18, 2020 at 3:19 PM Tzu-Li (Gordon) Tai <tzuli...@apache.org>
> > > > > > > >> > wrote:
> > > > > > >> >
> > > > > > >> > > Thanks Xintong for driving this.
> > > > > > >> > >
> > > > > > > >> > > I'd like to make two more issues related to the Kinesis connector
> > > > > > > >> > > changes in 1.12.0 blockers for 1.12.1:
> > > > > > > >> > > https://issues.apache.org/jira/browse/FLINK-20630
> > > > > > > >> > > https://issues.apache.org/jira/browse/FLINK-20629
> > > > > > > >> > >
> > > > > > > >> > > There are already PRs for these issues from @Cranmer, Danny
> > > > > > > >> > > <cranm...@amazon.com>; I will try to merge these very soon.
> > > > > > >> > >
> > > > > > >> > > Cheers,
> > > > > > >> > > Gordon
> > > > > > >> > >
> > > > > > > >> > > On Fri, Dec 18, 2020 at 1:19 PM Guowei Ma <guowei....@gmail.com>
> > > > > > > >> > > wrote:
> > > > > > >> > >
> > > > > > >> > >> Thanks for driving this release Xintong.
> > > > > > > >> > >> I think https://issues.apache.org/jira/browse/FLINK-20652 should be
> > > > > > > >> > >> addressed.
> > > > > > >> > >>
> > > > > > >> > >> Best,
> > > > > > >> > >> Guowei
> > > > > > >> > >>
> > > > > > >> > >>
> > > > > > > >> > >> On Fri, Dec 18, 2020 at 11:53 AM Jingsong Li <jingsongl...@gmail.com>
> > > > > > > >> > >> wrote:
> > > > > > >> > >>
> > > > > > > >> > >> > Thanks for volunteering as our release manager Xintong. +1 for
> > > > > > > >> > >> > releasing Flink 1.12.1 soon.
> > > > > > > >> > >> >
> > > > > > > >> > >> > I think https://issues.apache.org/jira/browse/FLINK-20665 should be
> > > > > > > >> > >> > addressed; I marked it as a Blocker.
> > > > > > >> > >> >
> > > > > > >> > >> > Best,
> > > > > > >> > >> > Jingsong
> > > > > > >> > >> >
> > > > > > > >> > >> > On Fri, Dec 18, 2020 at 11:16 AM Yang Wang <danrtsey...@gmail.com>
> > > > > > > >> > >> > wrote:
> > > > > > >> > >> >
> > > > > > >> > >> > > Hi David,
> > > > > > >> > >> > >
> > > > > > > >> > >> > > I will take a look at the ticket FLINK-20648 and try to get it
> > > > > > > >> > >> > > resolved in this release cycle.
> > > > > > >> > >> > >
> > > > > > > >> > >> > > @Xintong Song <tonysong...@gmail.com>
> > > > > > > >> > >> > > One more Kubernetes HA related issue: we need to support setting
> > > > > > > >> > >> > > the service account for the TaskManager pod [1]. Even though we
> > > > > > > >> > >> > > have a workaround for this issue, it is not acceptable to always
> > > > > > > >> > >> > > require the default service account to have sufficient
> > > > > > > >> > >> > > permissions.
> > > > > > > >> > >> > >
> > > > > > > >> > >> > > [1]. https://issues.apache.org/jira/browse/FLINK-20664
> > > > > > >> > >> > >
> > > > > > >> > >> > > Best,
> > > > > > >> > >> > > Yang
> > > > > > >> > >> > >
> > > > > > >> > >> > >
> > > > > > > >> > >> > > On Fri, Dec 18, 2020 at 12:47 AM David Morávek <david.mora...@gmail.com>
> > > > > > > >> > >> > > wrote:
> > > > > > >> > >> > >
> > > > > > > >> > >> > > > Hi, I think https://issues.apache.org/jira/browse/FLINK-20648
> > > > > > > >> > >> > > > should be addressed, as Kubernetes HA was one of the main selling
> > > > > > > >> > >> > > > points of this release. WDYT?
> > > > > > >> > >> > > >
> > > > > > >> > >> > > > D.
> > > > > > >> > >> > > >
> > > > > > > >> > >> > > > > On 17. 12. 2020, at 13:54, Yun Tang <myas...@live.com> wrote:
> > > > > > >> > >> > > > >
> > > > > > >> > >> > > > > Thanks for driving this quick-fix release.
> > > > > > > >> > >> > > > > +1 for fixing the bug in the RocksDB state backend with reduce
> > > > > > > >> > >> > > > > operators.
> > > > > > >> > >> > > > >
> > > > > > >> > >> > > > > Best
> > > > > > >> > >> > > > > Yun Tang
> > > > > > >> > >> > > > > ________________________________
> > > > > > >> > >> > > > > From: Till Rohrmann <trohrm...@apache.org>
> > > > > > >> > >> > > > > Sent: Thursday, December 17, 2020 20:51
> > > > > > >> > >> > > > > To: dev <dev@flink.apache.org>
> > > > > > > >> > >> > > > > Subject: Re: [DISCUSS] Releasing Apache Flink 1.12.1
> > > > > > >> > >> > > > >
> > > > > > > >> > >> > > > > Thanks for volunteering as our release manager Xintong. +1 for a
> > > > > > > >> > >> > > > > swift bug fix release.
> > > > > > >> > >> > > > >
> > > > > > >> > >> > > > > Cheers,
> > > > > > >> > >> > > > > Till
> > > > > > >> > >> > > > >
> > > > > > > >> > >> > > > >> On Thu, Dec 17, 2020 at 1:20 PM Xintong Song <xts...@apache.org>
> > > > > > > >> > >> > > > >> wrote:
> > > > > > >> > >> > > > >>
> > > > > > >> > >> > > > >> Hi devs,
> > > > > > >> > >> > > > >>
> > > > > > > >> > >> > > > >> It's been one week since we announced Apache Flink 1.12.0, and
> > > > > > > >> > >> > > > >> there are already many issues reported, some of which are quite
> > > > > > > >> > >> > > > >> critical. Thus, I would like to start a discussion on releasing
> > > > > > > >> > >> > > > >> Flink 1.12.1 soon.
> > > > > > >> > >> > > > >>
> > > > > > > >> > >> > > > >> I would like to volunteer for managing this release.
> > > > > > >> > >> > > > >>
> > > > > > > >> > >> > > > >> I've noticed the following issues that need to be included in the
> > > > > > > >> > >> > > > >> new bugfix release.
> > > > > > >> > >> > > > >>
> > > > > > >> > >> > > > >>   - The entrypoint script for the official
> docker
> > > > image
> > > > > > does
> > > > > > >> > not
> > > > > > >> > >> > meet
> > > > > > >> > >> > > > the
> > > > > > >> > >> > > > >>   standards of docker-library/official-images
> > repo.
> > > > [1]
> > > > > > >> > >> > > > >>   - Streaming jobs with window-less reduce
> > operation
> > > > do
> > > > > > now
> > > > > > >> > work
> > > > > > >> > >> > with
> > > > > > >> > >> > > > >>   RocksDB state backend. [2]
> > > > > > >> > >> > > > >>   - @Stephan mentioned some Kafka fixes ([3] and
> > > maybe
> > > > > > more)
> > > > > > >> > >> that he
> > > > > > >> > >> > > > would
> > > > > > >> > >> > > > >>   try to make into this release.
> > > > > > >> > >> > > > >>   - @Kurt mentioned a batch workload instability
> > > > related
> > > > > > to
> > > > > > >> > >> managed
> > > > > > >> > >> > > > memory
> > > > > > >> > >> > > > >>   being released slowly, which his team is
> > currently
> > > > > > >> > >> investigating
> > > > > > >> > >> > and
> > > > > > >> > >> > > > >> would
> > > > > > >> > >> > > > >>   try to fix in this release.
> > > > > > >> > >> > > > >>
> > > > > > > >> > >> > > > >> Apart from the issues above, please let us know in this thread if
> > > > > > > >> > >> > > > >> there are any other fixes that we should try to include. I'll try
> > > > > > > >> > >> > > > >> to communicate with the issue owners and come up with a time
> > > > > > > >> > >> > > > >> estimation early next week.
> > > > > > >> > >> > > > >>
> > > > > > >> > >> > > > >> Thanks,
> > > > > > >> > >> > > > >> Xintong
> > > > > > >> > >> > > > >>
> > > > > > >> > >> > > > >> [1]
> > > https://issues.apache.org/jira/browse/FLINK-20650
> > > > > > >> > >> > > > >> [2]
> > > https://issues.apache.org/jira/browse/FLINK-20646
> > > > > > >> > >> > > > >> [3]
> > > https://issues.apache.org/jira/browse/FLINK-20379
> > > > > > >> > >> > > > >>
> > > > > > >> > >> > > >
> > > > > > >> > >> > >
> > > > > > >> > >> >
> > > > > > >> > >> >
> > > > > > >> > >> > --
> > > > > > >> > >> > Best, Jingsong Lee
> > > > > > >> > >> >
> > > > > > >> > >>
> > > > > > >> > >
> > > > > > >> >
> > > > > > >>
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
>
