[ 
https://issues.apache.org/jira/browse/FLINK-29200?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aitozi updated FLINK-29200:
---------------------------
    Description: 
Currently, if the TaskManager heartbeat times out, the pod is deleted 
immediately. This makes it hard to debug the underlying cause, e.g. we 
cannot easily retrieve core dump files when the crash was caused by a JVM bug.

So, I propose introducing an option to control the delay before pod deletion. 
It can be enabled to keep the pod alive for debugging.

  was:
Currently, if the TaskManager heartbeat timeout the pod will be deleted 
immediately. It's not very convenient for debugging the internal reason, eg: we 
can not easily get the core dump files if it's crashed for JVM bugs and so on.

So, I purpose to introduce an option to control the delay of the pod deletion, 
it can be enabled to keep the pod alive for some debugging reason.


> Provide the way to delay the pod deletion for debugging purpose
> ---------------------------------------------------------------
>
>                 Key: FLINK-29200
>                 URL: https://issues.apache.org/jira/browse/FLINK-29200
>             Project: Flink
>          Issue Type: Improvement
>          Components: Deployment / Kubernetes
>            Reporter: Aitozi
>            Priority: Major
>
> Currently, if the TaskManager heartbeat times out, the pod is deleted 
> immediately. This makes it hard to debug the underlying cause, e.g. we 
> cannot easily retrieve core dump files when the crash was caused by a JVM 
> bug.
> So, I propose introducing an option to control the delay before pod 
> deletion. It can be enabled to keep the pod alive for debugging.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
