So do you start your Flink cluster on K8s with the YAML here [1]? I have
tested it multiple times, and it always works well. If not, could you
share your YAML file with me?

[1].
https://ci.apache.org/projects/flink/flink-docs-master/ops/deployment/kubernetes.html#session-cluster-resource-definitions

Best,
Yang

Li Peng <li.p...@doordash.com> wrote on Wed, Feb 5, 2020 at 5:53 AM:

> Hey Yang,
>
> The jobmanager and taskmanagers are all part of the same deployment; when
> I delete the deployment, all the pods are told to terminate.
>
> The status of the taskmanager is "terminating", and it waits until the
> taskmanager times out in that error loop before it actually terminates.
>
> Thanks,
> Li
>
> On Thu, Jan 30, 2020 at 6:22 PM Yang Wang <danrtsey...@gmail.com> wrote:
>
>> I think if you want to delete your Flink cluster on K8s, you need to
>> directly delete all the created deployments (the jobmanager deployment
>> and the taskmanager deployment). As for the configmap and service, you
>> can leave them in place if you want to reuse them for the next Flink
>> cluster deployment.
>>
>> What is the status of the taskmanager pod when you delete it and it gets stuck?
>>
>>
>> Best,
>> Yang
>>
>> Li Peng <li.p...@doordash.com> wrote on Fri, Jan 31, 2020 at 4:51 AM:
>>
>>> Hi Yun,
>>>
>>> I'm currently specifying that specific RPC address in my kubernetes
>>> charts for convenience. Should I be generating a new one for every
>>> deployment?
>>>
>>> And yes, I am deleting the pods using those commands. I'm just noticing
>>> that the task-manager termination process is held up by the
>>> registration timeout check, so that instead of terminating quickly, the
>>> task-manager waits for 5 minutes to time out before terminating. I'd
>>> expect it to just terminate without waiting for that registration
>>> timeout. Is there a way to configure that?
>>>
>>> Thanks,
>>> Li
>>>
>>>
>>> On Thu, Jan 30, 2020 at 8:53 AM Yun Tang <myas...@live.com> wrote:
>>>
>>>> Hi Li
>>>>
>>>> Why do you still use 'job-manager' as the jobmanager.rpc.address for the
>>>> second, new cluster? If you used another RPC address, the previous task
>>>> managers would not try to register with the old one.
>>>>
>>>> Take the Flink documentation for K8s [1] as an example. You can
>>>> list/delete all pods like this:
>>>>
>>>> kubectl get/delete pods -l app=flink
>>>>
>>>>
>>>> By the way, the default registration timeout is 5 min [2]; task managers
>>>> that cannot register with the JM will shut themselves down after 5 minutes.
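>>>>
>>>> A minimal sketch of how that timeout could be shortened, assuming the
>>>> option behind [2] is taskmanager.registration.timeout and that it is
>>>> set in flink-conf.yaml (or in whatever ConfigMap carries that file in
>>>> your setup). The 30 s value is only an illustration, not a
>>>> recommendation:
>>>>
>>>> ```yaml
>>>> # flink-conf.yaml (assumed location of the setting)
>>>> # Default is 5 min: a TaskManager that cannot register with the
>>>> # JobManager within this window terminates itself.
>>>> # 30 s is a hypothetical example value.
>>>> taskmanager.registration.timeout: 30 s
>>>> ```
>>>>
>>>> Note the trade-off: with a shorter value, task managers also give up
>>>> sooner when the job manager is only briefly unreachable.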
>>>>
>>>> [1]
>>>> https://ci.apache.org/projects/flink/flink-docs-stable/ops/deployment/kubernetes.html#session-cluster-resource-definitions
>>>> [2]
>>>> https://github.com/apache/flink/blob/7e1a0f446e018681cb537dd936ae54388b5a7523/flink-core/src/main/java/org/apache/flink/configuration/TaskManagerOptions.java#L158
>>>>
>>>> Best
>>>> Yun Tang
>>>>
>>>> ------------------------------
>>>> *From:* Li Peng <li.p...@doordash.com>
>>>> *Sent:* Thursday, January 30, 2020 9:24
>>>> *To:* user <user@flink.apache.org>
>>>> *Subject:* Task-manager kubernetes pods take a long time to terminate
>>>>
>>>> Hey folks, I'm deploying a Flink cluster via kubernetes, and starting
>>>> each task manager with taskmanager.sh. I noticed that when I tell kubectl
>>>> to delete the deployment, the job-manager pod usually terminates very
>>>> quickly, but any task-manager that isn't terminated before the
>>>> job-manager usually gets stuck in this loop:
>>>>
>>>> 2020-01-29 09:18:47,867 INFO
>>>>  org.apache.flink.runtime.taskexecutor.TaskExecutor            - Could not
>>>> resolve ResourceManager address 
>>>> akka.tcp://flink@job-manager:6123/user/resourcemanager,
>>>> retrying in 10000 ms: Could not connect to rpc endpoint under address
>>>> akka.tcp://flink@job-manager:6123/user/resourcemanager
>>>>
>>>> It then does this for about 10 minutes (?), and then shuts down. If I'm
>>>> deploying a new cluster, this pod will try to register itself with the new
>>>> job manager before terminating later. This isn't a troubling issue as far as
>>>> I can tell, but I find it annoying that I sometimes have to force delete
>>>> the pods.
>>>>
>>>> Any easy ways to just have the task managers terminate gracefully and
>>>> quickly?
>>>>
>>>> Thanks,
>>>> Li
>>>>
>>>
