Hi guys,
It looks suspicious that the TM pod termination is potentially delayed by
the reconnect to a killed JM.
I created an issue to investigate this:
https://issues.apache.org/jira/browse/FLINK-15946
Let's continue the discussion there.
Best,
Andrey
On Wed, Feb 5, 2020 at 11:49 AM Yang Wang wrote:
Maybe you need to check the kubelet logs to see why it gets stuck in the
"Terminating" state for a long time. Even if it needs to clean up the
ephemeral storage, it should not take so long.
Best,
Yang
Li Peng wrote on Wed, Feb 5, 2020 at 10:42 AM:
My yml files follow most of the instructions here:
http://shzhangji.com/blog/2019/08/24/deploy-flink-job-cluster-on-kubernetes/
What command did you use to delete the deployments? I use: helm
--tiller-namespace prod delete --purge my-deployment
I noticed that for environments without much data
>>>> By the way, the default registration timeout is 5 min [2]; task managers
>>>> that cannot register with the JM will terminate themselves after 5 minutes.
>>>>
>>>> [1]
>>>> https://ci.apache.org/projects/flink/flink-docs-stable/ops/deployment/kubernetes.html#session-cluster-resource-definitions
>>> [2]
>>> https://github.com/apache/flink/blob/7e1a0f446e018681cb537dd936ae54388b5a7523/flink-core/src/main/java/org/apache/flink/configuration/TaskManagerOptions.java#L158
>>>
>>> Best
>>> Yun Tang
>>>
> ------
> *From:* Li Peng
> *Sent:* Thursday, January 30, 2020 9:24
> *To:* user
> *Subject:* Task-manager kubernetes pods take a long time to terminate
>
Hey folks, I'm deploying a Flink cluster via kubernetes, and starting each
task manager with taskmanager.sh. I noticed that when I tell kubectl to
delete the deployment, the job-manager pod usually terminates very quickly,
but any task-manager that doesn't get terminated before the job-manager,
usu