Hi devs,

I'd like to start a discussion on FLIP-416: Deprecate and remove the RestoreMode#LEGACY [1].

FLIP-193 [2] introduced two modes of state file ownership during checkpoint restoration: RestoreMode#CLAIM and RestoreMode#NO_CLAIM. The LEGACY mode, which is how Flink behaved until 1.15, has been superseded by NO_CLAIM as the default mode.

The main drawback of LEGACY mode is that the new job relies on artifacts from the old job without cleaning them up, so users cannot tell when it is safe to delete the old checkpoint directories. As a result, unnecessary checkpoint files accumulate and are never cleaned up. For the sake of cluster availability and job maintenance, using LEGACY mode is not recommended. Users can choose either of the other two modes to get clear semantics for state file ownership.

This FLIP proposes to deprecate the LEGACY mode and remove it completely in the upcoming Flink 2.0. This will make the semantics clear, eliminate many bugs caused by mode transitions involving LEGACY mode (e.g. FLINK-27114 [3]), and improve code maintainability.

Looking forward to hearing from you!

[1] https://cwiki.apache.org/confluence/x/ookkEQ
[2] https://cwiki.apache.org/confluence/x/bIyqCw
[3] https://issues.apache.org/jira/browse/FLINK-27114

Best,
Zakelly