GitHub user makeyang opened a pull request: https://github.com/apache/flink/pull/6019
[FLINK-9182] async checkpoints for timer service

## What is the purpose of the change

This change adds asynchronous checkpoints for the timer service. The approach is based on the discussion in the previous PR for FLINK-9182: https://github.com/apache/flink/pull/5908

## Brief change log

  - In the synchronous part: make a flat copy of the internal array of the priority queue.
  - In the asynchronous part: build the key groups and write the timers per key group.

## Does this pull request potentially affect one of the following parts:

  - Dependencies (does it add or upgrade a dependency): (no)
  - The public API, i.e., is any changed class annotated with `@Public(Evolving)`: (no)
  - The serializers: (no)
  - The runtime per-record code paths (performance sensitive): (no)
  - Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Yarn/Mesos, ZooKeeper: (no)
  - The S3 file system connector: (no)

## Documentation

  - Does this pull request introduce a new feature? (no)

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/makeyang/flink FLINK-9182-version2

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/flink/pull/6019.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #6019

----
commit 82799922203bd6cb959c11336f71aee4def431d7
Author: makeyang <makeyang@...>
Date:   2018-05-16T05:44:16Z

    [FLINK-9182] async checkpoints for timer service

    The whole idea is based on the discussion on GitHub: https://github.com/apache/flink/pull/5908
    The idea was proposed by StefanRRichter as below:

    "Second, I would probably suggest a simpler model for the async snapshots. You dropped the idea of making flat copies, but I wonder if this was premature. I can see that calling set.toArray(...) per keygroup could (maybe) turn out a bit slow because it has to potentially iterate and flatten linked entries.
    However, with async snapshots, we could get rid of the key-group partitioning of sets, and instead do a flat copy of the internal array of the priority queue. This would translate to just a single memcopy call internally, which is very efficient. In the async part, we can still partition the timers by key-group in a similar way as the copy-on-write state table does. This would avoid slowing down the event processing path (in fact improving it by unifying the sets) and also keep the approach very straightforward and less invasive."

----

---
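The two-phase snapshot described in the change log and in the quoted suggestion can be sketched roughly as follows. This is a minimal illustration, not the code in this pull request: the `TimerSnapshotSketch` class, the `Timer` type, and the `keyGroupOf`, `flatCopy`, and `partitionByKeyGroup` helpers are all hypothetical names, and the modulo-based key-group assignment is a simplification standing in for whatever assignment Flink actually uses.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.CompletableFuture;

final class TimerSnapshotSketch {

    /** Hypothetical immutable timer: a key plus the time at which it fires. */
    static final class Timer {
        final String key;
        final long timestamp;

        Timer(String key, long timestamp) {
            this.key = key;
            this.timestamp = timestamp;
        }
    }

    /** Simplified, assumed key-group assignment (not Flink's real function). */
    static int keyGroupOf(String key, int numKeyGroups) {
        return Math.floorMod(key.hashCode(), numKeyGroups);
    }

    /** Synchronous part: one flat (shallow) copy of the queue's backing array. */
    static Timer[] flatCopy(Timer[] heapArray, int size) {
        return Arrays.copyOf(heapArray, size);
    }

    /** Asynchronous part: partition the copied timers by key group. */
    static List<List<Timer>> partitionByKeyGroup(Timer[] snapshot, int numKeyGroups) {
        List<List<Timer>> groups = new ArrayList<>(numKeyGroups);
        for (int i = 0; i < numKeyGroups; i++) {
            groups.add(new ArrayList<>());
        }
        for (Timer timer : snapshot) {
            groups.get(keyGroupOf(timer.key, numKeyGroups)).add(timer);
        }
        return groups;
    }

    public static void main(String[] args) throws Exception {
        // Stand-in for the priority queue's internal array (assumed 0-indexed here).
        Timer[] heapArray = new Timer[8];
        heapArray[0] = new Timer("a", 10L);
        heapArray[1] = new Timer("b", 20L);
        heapArray[2] = new Timer("c", 30L);
        int size = 3;

        // Sync part: cheap array copy taken while event processing is paused.
        Timer[] snapshot = flatCopy(heapArray, size);

        // Event processing may now change the live queue without affecting the copy.
        heapArray[0] = new Timer("z", 99L);

        // Async part: the partitioning (and, in a real system, the writing) runs off-thread.
        CompletableFuture<List<List<Timer>>> future =
            CompletableFuture.supplyAsync(() -> partitionByKeyGroup(snapshot, 4));

        List<List<Timer>> groups = future.get();
        for (int g = 0; g < groups.size(); g++) {
            for (Timer timer : groups.get(g)) {
                System.out.println("key group " + g + ": " + timer.key + " @ " + timer.timestamp);
            }
        }
    }
}
```

Because the copy is shallow, the sketch relies on the timer objects being immutable. If timers could be modified in place after the copy, the asynchronous part would need its own protection against concurrent changes.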