Hi Zhu,

Looks like it's expected. Those are the cases that happened in our
cluster.

Thanks for your response, Zhu

Cam



On Sun, Aug 11, 2019 at 10:53 PM Zhu Zhu <reed...@gmail.com> wrote:

> Another possibility is the JM is killed externally, e.g. K8s may kill
> JM/TM if it exceeds the resource limit.
>
> Thanks,
> Zhu Zhu
>
> On Mon, Aug 12, 2019 at 1:45 PM Zhu Zhu <reed...@gmail.com> wrote:
>
>> Hi Cam,
>>
>> The Flink master should not die when it gets disconnected from task managers.
>> It may exit in the cases below:
>> 1. The job terminated (FINISHED/FAILED/CANCELED). If your job is
>> configured with no restart retries, a TM failure can cause the job to be
>> FAILED.
>> 2. The JM lost HA leadership, e.g. it lost its connection to ZK.
>> 3. It encountered another unexpected fatal error. In this case we need to
>> check the log to see what happened.
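>>
>> For case 1, a minimal flink-conf.yaml sketch, assuming a fixed-delay
>> restart strategy suits the job. The attempt count and delay are
>> illustrative values only, not settings taken from this cluster:
>>
>> ```yaml
>> # Restart the job after a failure instead of failing it immediately.
>> restart-strategy: fixed-delay
>> # Assumed value: give up after 3 failed restart attempts.
>> restart-strategy.fixed-delay.attempts: 3
>> # Assumed value: wait 10 seconds between attempts.
>> restart-strategy.fixed-delay.delay: 10 s
>> ```
>>
>> With a setting like this, a single TM failure should lead to a job
>> restart rather than a FAILED job, as long as attempts remain.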
>>
>> Thanks,
>> Zhu Zhu
>>
>> On Mon, Aug 12, 2019 at 12:15 PM Cam Mach <cammac...@gmail.com> wrote:
>>
>>> Hello Flink experts,
>>>
>>> We are running Flink under Kubernetes and see that the Job Manager
>>> dies/restarts whenever a Task Manager dies/restarts or they can't connect
>>> to each other. Are there any specific configurations/parameters that we
>>> need to turn on to stop this? Or is this expected?
>>>
>>> Thanks,
>>> Cam
>>>
>>>
