Yang Wang created FLINK-25865:
---------------------------------

             Summary: Support to set restart policy of TaskManager pod for native K8s integration
                 Key: FLINK-25865
                 URL: https://issues.apache.org/jira/browse/FLINK-25865
             Project: Flink
          Issue Type: Improvement
          Components: Deployment / Kubernetes
            Reporter: Yang Wang


After FLIP-201, Flink's TaskManagers can be restarted without losing their 
local state. It is therefore reasonable to make the restart policy[1] of the 
TaskManager pod configurable.

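The current restart policy is {{Never}}. Flink always deletes a failed 
TaskManager pod directly and creates a new one instead. Making the policy 
configurable could reduce the recovery time after a TaskManager failure, 
since the container can be restarted inside the existing pod instead of 
waiting for a replacement pod to be created.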

Please note that the working directory needs to be located in an emptyDir[2] 
volume, which is retained across container restarts within the same pod.
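
To illustrate the combination described above, here is a minimal sketch of a TaskManager pod template that pairs a non-{{Never}} restart policy with an emptyDir-backed working directory. It is only an illustration, not the proposed implementation: the function name, the volume name and the mount path {{/flink-working-dir}} are hypothetical, and how Flink would actually expose the restart policy is left open by this ticket. The script prints JSON, which Kubernetes accepts in place of YAML.

```python
import json

# Restart policies defined by Kubernetes for a pod.
ALLOWED_RESTART_POLICIES = {"Always", "OnFailure", "Never"}


def taskmanager_pod_template(restart_policy="OnFailure",
                             working_dir="/flink-working-dir"):
    """Build a pod template fragment as a plain dict.

    Assumptions (not taken from the ticket): the function name, the volume
    name "working-dir" and the default mount path are hypothetical.
    """
    if restart_policy not in ALLOWED_RESTART_POLICIES:
        raise ValueError(f"unsupported restart policy: {restart_policy}")
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": "taskmanager-pod-template"},
        "spec": {
            # Flink currently uses "Never" here; this ticket asks for it
            # to become configurable.
            "restartPolicy": restart_policy,
            "containers": [
                {
                    # Container name used by Flink pod templates.
                    "name": "flink-main-container",
                    "volumeMounts": [
                        {"name": "working-dir", "mountPath": working_dir}
                    ],
                }
            ],
            # An emptyDir lives as long as the pod does, so its contents
            # survive container restarts but not pod deletion.
            "volumes": [{"name": "working-dir", "emptyDir": {}}],
        },
    }


if __name__ == "__main__":
    print(json.dumps(taskmanager_pod_template(), indent=2))
```

The TaskManager's working directory would then have to be configured to point at the mounted path so that the local state lands on the emptyDir volume.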

 

[1]. https://kubernetes.io/docs/concepts/workloads/pods/pod-lifecycle/#restart-policy

[2]. https://kubernetes.io/docs/concepts/storage/volumes/#emptydir



--
This message was sent by Atlassian Jira
(v8.20.1#820001)
