Roman Khachatryan created FLINK-28597:
-----------------------------------------
Summary: Empty checkpoint folders not deleted on job cancellation
if their shared state is still in use
Key: FLINK-28597
URL: https://issues.apache.org/jira/browse/FLINK-28597
Project: Flink
Issue Type: Bug
Components: Runtime / Checkpointing
Affects Versions: 1.16.0
Reporter: Roman Khachatryan
Assignee: Roman Khachatryan
Fix For: 1.16.0
After FLINK-25872, SharedStateRegistry registers all state handles, including
private ones.
Once the state isn't in use AND the checkpoint is subsumed, it will actually be
discarded.
This is done to prevent premature deletion when recovering in CLAIM mode:
1. RocksDB native savepoint folder (shared state is stored in the chk-xx folder,
so the deletion might fail)
2. Initial non-changelog checkpoint when switching to changelog-based
checkpoints (private state of the initial checkpoint might be included in
later checkpoints, and its deletion would invalidate them)
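To illustrate the discard rule above, here is a minimal sketch of a registry that keeps a state handle until no retained checkpoint uses it. It is a simplified model, not Flink's SharedStateRegistry: the class RegistrySketch, its methods, and the handle name "sst-1" are all hypothetical, and the "not in use AND subsumed" condition is modelled as "no remaining checkpoint references the handle".

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Iterator;
import java.util.Map;
import java.util.Set;

/** Hypothetical, simplified model; NOT the actual Flink SharedStateRegistry API. */
class RegistrySketch {
    // State handle id -> ids of the retained checkpoints that reference it.
    private final Map<String, Set<Long>> users = new HashMap<>();
    // Handles that have actually been discarded.
    private final Set<String> discarded = new HashSet<>();

    /** Registers a handle (shared or private) as used by the given checkpoint. */
    void register(String handle, long checkpointId) {
        users.computeIfAbsent(handle, k -> new HashSet<>()).add(checkpointId);
    }

    /** Subsumes a checkpoint; handles no other checkpoint uses are discarded. */
    void subsume(long checkpointId) {
        Iterator<Map.Entry<String, Set<Long>>> it = users.entrySet().iterator();
        while (it.hasNext()) {
            Map.Entry<String, Set<Long>> entry = it.next();
            entry.getValue().remove(checkpointId);
            if (entry.getValue().isEmpty()) {
                discarded.add(entry.getKey());
                it.remove();
            }
        }
    }

    boolean isDiscarded(String handle) {
        return discarded.contains(handle);
    }

    public static void main(String[] args) {
        RegistrySketch registry = new RegistrySketch();
        registry.register("sst-1", 1L); // created by the initial checkpoint
        registry.register("sst-1", 2L); // reused by a later checkpoint
        registry.subsume(1L);
        System.out.println(registry.isDiscarded("sst-1"));
        registry.subsume(2L);
        System.out.println(registry.isDiscarded("sst-1"));
    }
}
```

In this model, subsuming checkpoint 1 does not discard "sst-1" because checkpoint 2 still references it, which mirrors case 2 above; the handle is discarded only after checkpoint 2 is subsumed as well.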
Additionally, checkpoint folders remain undeleted for longer, which might
be confusing.
In case of a crash, more folders will remain.
cc: [~Yanfei Lei], [~ym]