Gyula Fora created FLINK-39507:
----------------------------------

             Summary: Cluster health check should not run on terminal / failed jobs
                 Key: FLINK-39507
                 URL: https://issues.apache.org/jira/browse/FLINK-39507
             Project: Flink
          Issue Type: Bug
          Components: Kubernetes Operator
            Reporter: Gyula Fora
            Assignee: Gyula Fora


Currently the cluster / job health check logic is sometimes executed on
terminal/failed jobs. This can cause the operator to try to restart these
jobs from HA metadata, which inevitably leads to an unrecoverable failure.

We should simply exclude these deployments based on the job status.
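
A minimal sketch of what such an exclusion could look like. This is not the
operator's actual code: the class HealthCheckGate, the JobState enum, and the
method shouldRunHealthCheck are hypothetical names introduced for illustration.
The set of states treated as terminal (FAILED, CANCELED, FINISHED) is likewise
an assumption.

```java
import java.util.EnumSet;
import java.util.Set;

// Hypothetical helper, not part of the Flink Kubernetes Operator API.
class HealthCheckGate {

    // Assumed subset of job states, defined locally for illustration only.
    enum JobState { RUNNING, RESTARTING, FAILED, CANCELED, FINISHED }

    // States assumed to be terminal: the job will not make further progress.
    private static final Set<JobState> TERMINAL =
            EnumSet.of(JobState.FAILED, JobState.CANCELED, JobState.FINISHED);

    // Returns true only when the job is in a non-terminal state, so the
    // health check (and any restart it may trigger) is skipped otherwise.
    static boolean shouldRunHealthCheck(JobState state) {
        return !TERMINAL.contains(state);
    }

    public static void main(String[] args) {
        for (JobState s : JobState.values()) {
            System.out.println(s + " -> run health check: " + shouldRunHealthCheck(s));
        }
    }
}
```

With a guard like this evaluated before the health check, a deployment whose
job has already failed or finished would never reach the restart path.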



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
