You may need to check the kubelet logs to see why the pod gets stuck in the
"Terminating" state for such a long time. Even if it needs to clean up the
ephemeral storage, that should not take so long.
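
To find which pods to look at first, here is a minimal sketch, not something
from your setup. It assumes you save or pipe the output of
`kubectl get pods -o json` and that a pod being deleted carries
`metadata.deletionTimestamp`. The function name `stuck_terminating` and the
5-minute threshold are hypothetical choices for illustration.

```python
import json
from datetime import datetime, timedelta, timezone


def stuck_terminating(pods_json, now, threshold=timedelta(minutes=5)):
    """Return (pod name, node name) for pods whose deletion is overdue.

    pods_json: text produced by `kubectl get pods -o json` (assumed input).
    now:       timezone-aware datetime used as the reference time.
    threshold: how far past deletionTimestamp counts as "stuck"
               (hypothetical default, tune it for your cluster).
    """
    stuck = []
    for pod in json.loads(pods_json).get("items", []):
        meta = pod.get("metadata", {})
        ts = meta.get("deletionTimestamp")
        if ts is None:
            continue  # pod is not being deleted
        # Timestamps are assumed to look like 2020-02-05T02:00:00Z (UTC).
        deadline = datetime.strptime(ts, "%Y-%m-%dT%H:%M:%SZ").replace(
            tzinfo=timezone.utc
        )
        if now - deadline > threshold:
            stuck.append((meta.get("name"), pod.get("spec", {}).get("nodeName")))
    return stuck
```

You could call it with the saved JSON text and `datetime.now(timezone.utc)`.
The node name in each result tells you which machine's kubelet log to read,
for example with `journalctl -u kubelet` if the kubelet runs under systemd
there (that is an assumption about your nodes).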


Best,
Yang

Li Peng <li.p...@doordash.com> wrote on Wed, Feb 5, 2020 at 10:42 AM:

> My yml files follow most of the instructions here:
>
>
> http://shzhangji.com/blog/2019/08/24/deploy-flink-job-cluster-on-kubernetes/
>
> What command did you use to delete the deployments? I use: helm
> --tiller-namespace prod delete --purge my-deployment
>
> I noticed that for environments without much data (like staging), this
> works flawlessly, but in production with a high volume of data, it gets
> stuck in a loop. I suspect that the extra time needed to clean up the task
> managers under high traffic delays their shutdown until after the job
> manager terminates, and then the task manager gets stuck in a loop when it
> detects that the job manager is dead.
>
> Thanks,
> Li
>
