Hi Flink devs,

Congxian, Kostas, and I have recently been discussing unifying the binary
formats for keyed state in savepoints, which would allow for more
operational flexibility, such as swapping state backends across restores.

As part of this FLIP, another main proposal is to start allowing
checkpoints and savepoints to have different formats. Savepoint formats
should in the future be designed with interoperability in mind, accepting
reasonable snapshot / restore overhead, while checkpoints are allowed to
be backend-specific for more efficient snapshots and restores.
Given recent proposals for the state backends, such as the disk-spilling
heap backend [1], this flexibility seems reasonable.

The main user-facing aspect this would affect is, of course, the binary
format of savepoints, as well as the fact that we will no longer guarantee
functional parity between savepoints and full checkpoints in the future
(w.r.t. operational features related to upgrading applications; so far they
have had equal functionality).

Therefore, we would like to collect feedback on the proposal before
continuing efforts.

This is the FLIP:
https://cwiki.apache.org/confluence/display/FLINK/FLIP-41%3A+Unify+Keyed+State+Snapshot+Binary+Format+for+Savepoints
.

I'm happy to discuss details and look forward to any feedback.

Cheers,
Gordon

[1]
http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-Proposal-to-support-disk-spilling-in-HeapKeyedStateBackend-td29109.html
