[ https://issues.apache.org/jira/browse/FLINK-29200?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Aitozi updated FLINK-29200:
---------------------------
    Description:
Currently, if a TaskManager heartbeat times out, the pod is deleted immediately. This makes it hard to debug the underlying cause, e.g. we cannot easily retrieve the core dump files if the process crashed because of a JVM bug.

So, I propose introducing an option to control the delay before the pod is deleted; it can be enabled to keep the pod alive for debugging.

  was:
Currently, if the TaskManager heartbeat timeout the pod will be deleted immediately. It's not very convenient for debugging the internal reason, eg: we can not easily get the core dump files if it's crashed for JVM bugs and so on.

So, I purpose to introduce an option to control the delay of the pod deletion, it can be enabled to keep the pod alive for some debugging reason.


> Provide the way to delay the pod deletion for debugging purpose
> ---------------------------------------------------------------
>
>                 Key: FLINK-29200
>                 URL: https://issues.apache.org/jira/browse/FLINK-29200
>             Project: Flink
>          Issue Type: Improvement
>          Components: Deployment / Kubernetes
>            Reporter: Aitozi
>            Priority: Major
>
> Currently, if a TaskManager heartbeat times out, the pod is deleted
> immediately. This makes it hard to debug the underlying cause, e.g. we
> cannot easily retrieve the core dump files if the process crashed because
> of a JVM bug.
> So, I propose introducing an option to control the delay before the pod is
> deleted; it can be enabled to keep the pod alive for debugging.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)