Hi Thomas,

Just one quick answer from my side about:
> * notifyCheckpointAborted needed to be implemented
> for org.apache.flink.runtime.state.CheckpointListener - can we have the
> default implementation in the interface so that users aren't forced to
> change their implementations

This is an intentional design decision [1]:

> Implementers should generally be forced to think about what to do when
> checkpoint is aborted.

Piotrek

[1] https://github.com/apache/flink/pull/8693#issuecomment-542834147

Piotr Nowojski | Staff Engineer

> On 24 Jun 2020, at 10:35, Till Rohrmann <trohrm...@apache.org> wrote:
>
> Thanks for testing the RC and for the feedback, Thomas. The problem with the
> taskmanager options is that the old
> (taskmanager.initial-registration-pause) and new options
> (cluster.registration.initial-timeout) don't have the same type. The old
> options have not been used for a long time (since version 1.5.0) and we
> wanted to remove them. As part of the removal, we added the old keys as
> deprecated options for the new ones. I believe this was a mistake. I've
> opened a PR to remove the deprecated keys from the new ConfigOptions [1].
>
> Please be aware that
>
> "taskmanager.initial-registration-pause": "500ms",
> "taskmanager.max-registration-pause": "5s",
> "taskmanager.refused-registration-pause": "5s",
>
> have not had any effect since version 1.5.0.
>
> [1] https://github.com/apache/flink/pull/12763
>
> Cheers,
> Till
>
> On Wed, Jun 24, 2020 at 4:17 AM Zhijiang <wangzhijiang...@aliyun.com.invalid>
> wrote:
>
>> Hi Thomas,
>>
>> Thanks for the valuable feedback and suggestions; I think they are
>> very helpful for making us better.
>>
>> I can give a direct answer for this issue:
>>> checkpoint alignment buffered metric missing - note that this job isn't
>>> using the new unaligned checkpointing that should be opt-in.
>>
>> The checkpoint alignment buffered metric would now always be 0, with or
>> without unaligned checkpointing, so we removed the metric directly.
>> The motivation for this change is to reduce in-flight buffers in order to
>> speed up checkpointing. After sending the barrier, the upstream side stops
>> sending any further buffers until it receives the alignment notification
>> from the downstream side. Therefore, the downstream side never needs to
>> cache buffers for blocked channels during alignment. We also described this
>> change in the release notes; see [1].
>>
>> [1]
>> https://github.com/apache/flink/pull/12699/files#diff-eaa874e007e88f283e96de2d61cc4140R174
>>
>> Best,
>> Zhijiang
>> ------------------------------------------------------------------
>> From:Thomas Weise <t...@apache.org>
>> Send Time:June 24, 2020 (Wednesday) 06:51
>> To:dev <dev@flink.apache.org>
>> Cc:zhijiang <zhiji...@apache.org>
>> Subject:Re: [ANNOUNCE] Apache Flink 1.11.0, release candidate #2
>>
>> Hi,
>>
>> Thanks for putting together the RC!
>>
>> I have some preliminary feedback from testing with commit
>> 934f91ead00fd658333f65ffa37ab60bd5ffd99b
>>
>> An internal benchmark application that reads from Kinesis and checkpoints
>> ~12GB performs comparably to 1.10.1
>>
>> A few issues came up while upgrading our codebase that may be worth
>> considering; please see the details below.
>>
>> Given my observations over the past few releases, I would like to suggest
>> that the community introduces a log of incompatible changes to be published
>> with the release notes. Though it is possible to analyze git history when
>> hitting compile errors, there are more subtle changes that can make
>> upgrades unnecessarily time-consuming.
>> Contributors introducing such
>> changes are probably in the best position to document.
>>
>> I'm planning to try this or the next RC with a couple more applications.
>>
>> Cheers,
>> Thomas
>>
>> * notifyCheckpointAborted needed to be implemented
>> for org.apache.flink.runtime.state.CheckpointListener - can we have the
>> default implementation in the interface so that users aren't forced to
>> change their implementations
>>
>> * following deprecated configuration values had to be modified to get
>> the job running:
>>
>> "taskmanager.initial-registration-pause": "500ms",
>> "taskmanager.max-registration-pause": "5s",
>> "taskmanager.refused-registration-pause": "5s",
>>
>> The error message was:
>>
>> Could not parse value '500ms' for key
>> 'cluster.registration.initial-timeout'.\n\tat
>> org.apache.flink.configuration.Configuration.getOptional(Configuration.java:753)\n\tat
>> org.apache.flink.configuration.Configuration.getLong(Configuration.java:298)\n\tat
>> org.apache.flink.runtime.registration.RetryingRegistrationConfiguration.fromConfiguration(RetryingRegistrationConfiguration.java:72)\n\tat
>> org.apache.flink.runtime.taskexecutor.TaskManagerServicesConfiguration.fromConfiguration(TaskManagerServicesConfiguration.java:262)\n\tat
>>
>> Though easy to fix, it's unfortunate that values are now treated
>> differently.
>>
>> * checkpoint alignment buffered metric missing - note that this job isn't
>> using the new unaligned checkpointing that should be opt-in.
>>
>> * -import org.apache.flink.table.api.java.StreamTableEnvironment;
>> +import org.apache.flink.table.api.bridge.java.StreamTableEnvironment;
>>
>> * -ClientUtils.executeProgram(DefaultExecutorServiceLoader.INSTANCE,
>> config, program.build());
>> +ClientUtils.executeProgram(DefaultExecutorServiceLoader.INSTANCE,
>> config, program.build(),
>> false, false);
>>
>> * ProcessingTimeCallback removed from StreamingFileSink
>>
>>
>> On Wed, Jun 17, 2020 at 6:29 AM Piotr Nowojski <pnowoj...@apache.org>
>> wrote:
>>
>>> Hi all,
>>>
>>> I would like to give an update about the RC2 status. We are now waiting for
>>> a green azure build on one final bug fix before creating RC2. This bug fix
>>> should be merged late afternoon/early evening Berlin time, so RC2 will be
>>> hopefully created tomorrow morning. Until then I would ask to not
>>> merge/backport commits to release-1.11 branch, including bug fixes. If you
>>> have something that's truly essential and should be treated as a release
>>> blocker, please reach out to me or Zhijiang.
>>>
>>> Best,
>>> Piotr Nowojski
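
Illustrative note on the registration-pause keys discussed in this thread: the stack trace shows the value being read through Configuration.getLong, so a duration string such as "500ms" cannot be parsed once the old key is mapped onto cluster.registration.initial-timeout. The following is a minimal, self-contained Java sketch of that type mismatch. It is not Flink code: the class RegistrationTimeoutSketch and the helpers parsesAsLong and toMillis are hypothetical names, and treating the new option as a plain number of milliseconds is an assumption made for illustration.

```java
import java.util.concurrent.TimeUnit;

/** Hypothetical sketch, not part of Flink. */
final class RegistrationTimeoutSketch {

    /** Mimics a long-typed option lookup: only a bare number parses. */
    static boolean parsesAsLong(String value) {
        try {
            Long.parseLong(value.trim());
            return true;
        } catch (NumberFormatException e) {
            return false;
        }
    }

    /**
     * Converts an old-style duration string ("500ms", "5s") into milliseconds.
     * A bare number is assumed (for this sketch) to already be milliseconds.
     */
    static long toMillis(String value) {
        String v = value.trim();
        // Check "ms" before "s", because "500ms" also ends with "s".
        if (v.endsWith("ms")) {
            return Long.parseLong(v.substring(0, v.length() - 2).trim());
        }
        if (v.endsWith("s")) {
            long seconds = Long.parseLong(v.substring(0, v.length() - 1).trim());
            return TimeUnit.SECONDS.toMillis(seconds);
        }
        return Long.parseLong(v);
    }

    public static void main(String[] args) {
        // The old values quoted in the thread.
        for (String old : new String[] {"500ms", "5s"}) {
            System.out.println(old + " parses as long: " + parsesAsLong(old)
                    + ", in milliseconds: " + toMillis(old));
        }
    }
}
```

If the old keys are still present in a job's configuration, the simplest fix suggested by the thread is to delete them, since they have had no effect since version 1.5.0; a conversion like toMillis would only matter when deliberately setting the new cluster.registration.* options.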