Gyula Fora created FLINK-26140:
----------------------------------

             Summary: Add basic handling mechanism to deal with job upgrade errors
                 Key: FLINK-26140
                 URL: https://issues.apache.org/jira/browse/FLINK-26140
             Project: Flink
          Issue Type: Sub-task
          Components: Deployment / Kubernetes
            Reporter: Gyula Fora

A stateful job upgrade can fail in several ways. For example:

 - Failure/timeout during savepoint
 - Incompatible state
 - Corrupted / not-found checkpoint
 - Error after restart

We should let the user declare strategies for handling the different error scenarios (such as rolling back to an earlier state) and specify which errors should be treated as fatal.
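
To make the idea concrete, here is a rough sketch of what a user-declared mapping from failure scenario to handling strategy might look like. Everything in it is hypothetical and for illustration only: `UpgradeErrorPolicy`, `UpgradeFailure`, `Action`, and the methods `on` and `resolve` are not existing Flink or Kubernetes operator APIs, and the actual design may differ.

```java
import java.util.EnumMap;
import java.util.Map;

/** Hypothetical sketch: maps upgrade failure scenarios to a user-chosen action. */
public class UpgradeErrorPolicy {

    /** The failure scenarios listed in this issue (names are assumptions). */
    public enum UpgradeFailure {
        SAVEPOINT_FAILED,        // failure/timeout during savepoint
        INCOMPATIBLE_STATE,      // state cannot be restored into the new job
        CHECKPOINT_UNAVAILABLE,  // corrupted / not-found checkpoint
        ERROR_AFTER_RESTART      // job fails after the upgraded version starts
    }

    /** What to do when a failure occurs (names are assumptions). */
    public enum Action {
        ROLLBACK, // roll back to the earlier state
        FATAL     // treat as a fatal error and stop
    }

    private final Map<UpgradeFailure, Action> actions = new EnumMap<>(UpgradeFailure.class);
    private final Action defaultAction;

    public UpgradeErrorPolicy(Action defaultAction) {
        this.defaultAction = defaultAction;
    }

    /** Declares the action for one failure scenario; returns this for chaining. */
    public UpgradeErrorPolicy on(UpgradeFailure failure, Action action) {
        actions.put(failure, action);
        return this;
    }

    /** Returns the declared action, or the default if none was declared. */
    public Action resolve(UpgradeFailure failure) {
        return actions.getOrDefault(failure, defaultAction);
    }

    public static void main(String[] args) {
        UpgradeErrorPolicy policy = new UpgradeErrorPolicy(Action.FATAL)
                .on(UpgradeFailure.SAVEPOINT_FAILED, Action.ROLLBACK)
                .on(UpgradeFailure.ERROR_AFTER_RESTART, Action.ROLLBACK);

        System.out.println(policy.resolve(UpgradeFailure.SAVEPOINT_FAILED));   // ROLLBACK
        System.out.println(policy.resolve(UpgradeFailure.INCOMPATIBLE_STATE)); // FATAL
    }
}
```

Defaulting undeclared scenarios to `FATAL` is one possible choice here: it keeps the behavior conservative unless the user explicitly opts in to a rollback.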