[ https://issues.apache.org/jira/browse/FLINK-36673?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17931787#comment-17931787 ]

Gyula Fora commented on FLINK-36673:
------------------------------------

I think this is a duplicate of https://issues.apache.org/jira/browse/FLINK-37370 and has been fixed on the main/release-11 branch

Can you please confirm?

> Operator is not properly handling failed deployments without savepoints
> -----------------------------------------------------------------------
>
>                 Key: FLINK-36673
>                 URL: https://issues.apache.org/jira/browse/FLINK-36673
>             Project: Flink
>          Issue Type: Bug
>          Components: Kubernetes Operator
>            Reporter: Yaroslav Tkachenko
>            Priority: Major
>         Attachments: Screenshot 2025-02-28 at 4.15.26 PM.png, Screenshot 2025-02-28 at 8.51.37 PM.png, Screenshot 2025-02-28 at 8.55.36 PM.png, stacktrace.txt
>
>
> I noticed an issue after upgrading Flink Kubernetes Operator from 1.9 to 1.10.
> When I deploy a FlinkDeployment that fails during the startup, I get a
> "ReconciliationException: Could not observe latest savepoint information"
> (full stacktrace is attached).
> I think the issue was introduced here:
> [https://github.com/apache/flink-kubernetes-operator/pull/871].
> *AbstractFlinkService.getLastCheckpoint* now throws a
> *ReconciliationException* when a savepoint is not available, and
> *SnapshotObserver.observeLatestCheckpoint* doesn't handle it properly. I
> think having no savepoint is completely normal in some situations (e.g. a
> brand new job).