+1 (binding) from my side

  - legal files (license, notice) look correct
  - no binaries in the release
  - ran examples from command line
  - ran some examples from web ui
  - log files look sane
  - RocksDB, incremental checkpoints, savepoints, moving savepoints
all work as expected.

There are some friction points, which have also been mentioned. However, I
am not sure they need to block the release.
  - Some batch examples in the web UI have not been working in 1.10. We
should fix that asap, because it impacts the "getting started" experience,
but I personally would not vote against the release based on that.
  - Same for the CDC bug. It is unfortunate, but I would not hold the
release at such a late stage for one special issue in a new connector.
Let's work on a timely 1.11.1.


I would withdraw my vote if we find a fundamental issue in the network
system that causes the increased checkpoint delays behind the job regression
Thomas mentioned.
Such a core bug would be a deal-breaker for a large fraction of users.




On Thu, Jul 2, 2020 at 11:35 AM Zhijiang <wangzhijiang...@aliyun.com.invalid>
wrote:

> I also agree with Till and Robert's proposals.
>
> In general I think we should not block the release based on the current
> estimation. Otherwise we keep postponing the release, new blocker bugs
> will probably appear in the meantime, and we risk getting stuck in a
> cycle in which users never get a final release in time. That does not
> mean RC4 has to be the final one, though; we can reevaluate the impact
> of the accumulated issues as we go.
>
> Regarding the performance regression, if possible we can reproduce it and
> analyze the cause based on Thomas's feedback, and then evaluate its
> effect.
>
> Regarding FLINK-18461, after syncing with Jark offline: the bug would
> affect one of three scenarios for using the CDC feature, and the affected
> scenario is actually the one most commonly used.
> My suggestion is to merge the fix into release-1.11 ATM since the PR is
> already open for review, and to finalize the conclusion later. If this
> issue is the only one left after RC4 goes through, then another option is to
> cover it in the next release, 1.11.1, as Robert suggested, since we can
> prepare the next minor release soon. If other blocker issues come up during
> voting and need to be resolved soon, then we should no doubt cover all
> of them in the next RC, RC5.
>
> Best,
> Zhijiang
>
>
> ------------------------------------------------------------------
> From:Till Rohrmann <trohrm...@apache.org>
> Send Time: Thursday, July 2, 2020, 16:46
> To:dev <dev@flink.apache.org>
> Cc:Zhijiang <wangzhijiang...@aliyun.com>
> Subject:Re: [VOTE] Release 1.11.0, release candidate #4
>
> I agree with Robert.
>
> @Chesnay: The problem has probably already existed in Flink 1.10 and
> before because we cannot run jobs with eager execution calls from the web
> ui. I agree with Robert that we can/should improve our documentation in
> this regard, though.
>
> @Thomas:
> 1. I will update the release notes to add a short section describing that
> one needs to configure the JobManager memory.
> 2. Concerning the performance regression we should look into it. I believe
> Zhijiang is very eager to learn more about your exact setup to further
> debug it. Again I agree with Robert to not block the release on it at the
> moment.
>
> @Jark: How much of a problem is FLINK-18461? Will it make the CDC feature
> completely unusable, or will it only break a subset of the use cases?
> If it is the latter, then I believe that we can document the
> limitations and try to fix it asap. Depending on the remaining testing the
> fix might make it into the 1.11.0 or the 1.11.1 release.
>
> Cheers,
> Till
>
> On Thu, Jul 2, 2020 at 10:33 AM Robert Metzger <rmetz...@apache.org>
> wrote:
>
> Thanks a lot for the thorough testing Thomas! This is really helpful!
>
>  @Chesnay: I would not block the release on this. The web submission does
>  not seem to be the documented / preferred way of job submission. It is
>  unlikely to harm the beginner's experience (and they would anyways not read
>  the release notes). I mention the beginner experience, because they are the
>  primary audience of the examples.
>
>  Regarding FLINK-18461 / Jark's issue: I would not block the release on
>  that, but still try to get it fixed asap. It is likely that this RC doesn't
>  go through (given the rate at which we are finding issues), and even if it
>  goes through, we can document it as a known issue in the release
>  announcement and immediately release 1.11.1.
>  Blocking the release on this causes quite a bit of work for the release
>  managers for rolling a new RC. Until we have understood the performance
>  regression Thomas is reporting, I would keep this RC open, and keep testing.
>
>
>  On Thu, Jul 2, 2020 at 8:34 AM Jark Wu <imj...@gmail.com> wrote:
>
>  > Hi,
>  >
>  > I'm very sorry but we just found a blocker issue, FLINK-18461 [1], in the
>  > new changelog source (CDC) feature.
>  > With this bug, queries on a changelog source can’t be inserted into an
>  > upsert sink (e.g. ES, JDBC, HBase),
>  > which is a common case in production. CDC is one of the important features
>  > of Table/SQL in this release,
>  > so from my side, I hope we can have this fix in 1.11.0; otherwise, this is
>  > a broken feature...
>  >
>  > Again, I am terribly sorry for delaying the release...
>  >
>  > Best,
>  > Jark
>  >
>  > [1]: https://issues.apache.org/jira/browse/FLINK-18461
>  >
>  > On Thu, 2 Jul 2020 at 12:02, Zhijiang <wangzhijiang...@aliyun.com.invalid>
>  > wrote:
>  >
>  > > Hi Thomas,
>  > >
>  > > Thanks for the efficient feedback.
>  > >
>  > > Regarding the suggestion of adding the release notes document, I agree
>  > > with your point. Maybe we should adjust the vote template in the
>  > > respective wiki accordingly to guide future release processes.
>  > >
>  > > Regarding the performance regression, could you provide some more details
>  > > so that we can better measure or reproduce it on our side?
>  > > E.g. I guess the topology only includes two vertices, source and sink?
>  > > What is the parallelism of each vertex?
>  > > Does the upstream shuffle data to the downstream via the rebalance
>  > > partitioner or another one?
>  > > Is the checkpoint mode exactly-once with the RocksDB state backend?
>  > > Did backpressure occur in this case?
>  > > How large is the regression in percent?
>  > >
>  > > Best,
>  > > Zhijiang
>  > >
>  > >
>  > >
>  > > ------------------------------------------------------------------
>  > > From:Thomas Weise <t...@apache.org>
>  > > Send Time: Thursday, July 2, 2020, 09:54
>  > > To:dev <dev@flink.apache.org>
>  > > Subject:Re: [VOTE] Release 1.11.0, release candidate #4
>  > >
>  > > Hi Till,
>  > >
>  > > Yes, we don't have the setting in flink-conf.yaml.
>  > >
>  > > Generally, we carry forward the existing configuration and any change to
>  > > default configuration values would impact the upgrade.
>  > >
>  > > Yes, since it is an incompatible change I would state it in the release
>  > > notes.
>  > >
>  > > Thanks,
>  > > Thomas
>  > >
>  > > BTW I found a performance regression while trying to upgrade another
>  > > pipeline with this RC. It is a simple Kinesis to Kinesis job. Wasn't able
>  > > to pin it down yet, symptoms include increased checkpoint alignment time.
>  > >
>  > > On Wed, Jul 1, 2020 at 12:04 AM Till Rohrmann <trohrm...@apache.org>
>  > > wrote:
>  > >
>  > > > Hi Thomas,
>  > > >
>  > > > just to confirm: When starting the image in local mode, then you don't have
>  > > > any of the JobManager memory configuration settings configured in the
>  > > > effective flink-conf.yaml, right? Does this mean that you have explicitly
>  > > > removed `jobmanager.heap.size: 1024m` from the default configuration? If
>  > > > this is the case, then I believe it was more of an unintentional artifact
>  > > > that it worked before and it has been corrected now so that one needs to
>  > > > specify the memory of the JM process explicitly. Do you think it would help
>  > > > to explicitly state this in the release notes?
>  > > >
>  > > > Cheers,
>  > > > Till
>  > > >
>  > > > On Wed, Jul 1, 2020 at 7:01 AM Thomas Weise <t...@apache.org> wrote:
>  > > >
>  > > > > Thanks for preparing another RC!
>  > > > >
>  > > > > As mentioned in the previous RC thread, it would be super helpful if the
>  > > > > release notes that are part of the documentation can be included [1]. It's
>  > > > > a significant time-saver to have read those first.
>  > > > >
>  > > > > I found one more non-backward compatible change that would be worth
>  > > > > addressing/mentioning:
>  > > > >
>  > > > > It is now necessary to configure the jobmanager heap size in
>  > > > > flink-conf.yaml (with either jobmanager.heap.size
>  > > > > or jobmanager.memory.heap.size). Why would I not want to do that anyways?
>  > > > > Well, we set it dynamically for a cluster deployment via the
>  > > > > flinkk8soperator, but the container image can also be used for testing with
>  > > > > local mode (./bin/jobmanager.sh start-foreground local). That will fail if
>  > > > > the heap wasn't configured and that's how I noticed it.
>  > > > >
>  > > > > Thanks,
>  > > > > Thomas
>  > > > >
>  > > > > [1]
>  > > > > https://ci.apache.org/projects/flink/flink-docs-release-1.11/release-notes/flink-1.11.html
>  > > > >
>  > > > > On Tue, Jun 30, 2020 at 3:18 AM Zhijiang <wangzhijiang...@aliyun.com.invalid>
>  > > > > wrote:
>  > > > >
>  > > > > > Hi everyone,
>  > > > > >
>  > > > > > Please review and vote on the release candidate #4 for the version 1.11.0,
>  > > > > > as follows:
>  > > > > > [ ] +1, Approve the release
>  > > > > > [ ] -1, Do not approve the release (please provide specific comments)
>  > > > > >
>  > > > > > The complete staging area is available for your review, which includes:
>  > > > > > * JIRA release notes [1],
>  > > > > > * the official Apache source release and binary convenience releases to be
>  > > > > > deployed to dist.apache.org [2], which are signed with the key with
>  > > > > > fingerprint 2DA85B93244FDFA19A6244500653C0A2CEA00D0E [3],
>  > > > > > * all artifacts to be deployed to the Maven Central Repository [4],
>  > > > > > * source code tag "release-1.11.0-rc4" [5],
>  > > > > > * website pull request listing the new release and adding announcement
>  > > > > > blog post [6].
>  > > > > >
>  > > > > > The vote will be open for at least 72 hours. It is adopted by majority
>  > > > > > approval, with at least 3 PMC affirmative votes.
>  > > > > >
>  > > > > > Thanks,
>  > > > > > Release Manager
>  > > > > >
>  > > > > > [1]
>  > > > > > https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522&version=12346364
>  > > > > > [2] https://dist.apache.org/repos/dist/dev/flink/flink-1.11.0-rc4/
>  > > > > > [3] https://dist.apache.org/repos/dist/release/flink/KEYS
>  > > > > > [4]
>  > > > > > https://repository.apache.org/content/repositories/orgapacheflink-1377/
>  > > > > > [5] https://github.com/apache/flink/releases/tag/release-1.11.0-rc4
>  > > > > > [6] https://github.com/apache/flink-web/pull/352
>  > > > > >
>  > > > > >
>  > > > >
>  > > >
>  > >
>  > >
>  >
>
>
