Hey folks, I'm deploying a Flink cluster via kubernetes, and starting each task manager with taskmanager.sh. I noticed that when I tell kubectl to delete the deployment, the job-manager pod usually terminates very quickly, but any task-manager that doesn't get terminated before the job-manager, usually gets stuck in this loop:
2020-01-29 09:18:47,867 INFO org.apache.flink.runtime.taskexecutor.TaskExecutor - Could not resolve ResourceManager address akka.tcp://flink@job-manager:6123/user/resourcemanager, retrying in 10000 ms: Could not connect to rpc endpoint under address akka.tcp://flink@job-manager:6123/user/resourcemanager It then does this for about 10 minutes(?), and then shuts down. If I'm deploying a new cluster, this pod will try to register itself with the new job manager before terminating lter. This isn't a troubling issue as far as I can tell, but I find it annoying that I sometimes have to force delete the pods. Any easy ways to just have the task managers terminate gracefully and quickly? Thanks, Li