Another possibility is that the JM is killed externally, e.g. K8s may kill the JM/TM if it exceeds its resource limit.
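
One way to check for this, as a rough sketch rather than a definitive tool: Kubernetes records why a container last terminated in the pod status, and a memory-limit kill shows up with the reason `OOMKilled`. The helper below assumes you have the pod description as JSON (for example from `kubectl get pod <pod-name> -o json`). The function name `oom_killed_containers` and the sample pod are hypothetical, made up for illustration.

```python
import json


def oom_killed_containers(pod: dict) -> list:
    """Return names of containers whose previous run ended with reason OOMKilled."""
    names = []
    statuses = pod.get("status", {}).get("containerStatuses") or []
    for cs in statuses:
        # lastState.terminated is only present if the container was restarted
        terminated = cs.get("lastState", {}).get("terminated") or {}
        if terminated.get("reason") == "OOMKilled":
            names.append(cs["name"])
    return names


# Hypothetical, heavily trimmed pod JSON used only to exercise the helper.
sample_pod = json.loads("""
{
  "status": {
    "containerStatuses": [
      {
        "name": "jobmanager",
        "restartCount": 3,
        "lastState": {"terminated": {"reason": "OOMKilled", "exitCode": 137}}
      },
      {
        "name": "sidecar",
        "restartCount": 0,
        "lastState": {}
      }
    ]
  }
}
""")

print(oom_killed_containers(sample_pod))
```

If the JM container shows up here, the restart was caused by the memory limit and not by Flink itself, so raising the container limit or lowering the JM heap is the place to look.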

Thanks,
Zhu Zhu

Zhu Zhu <reed...@gmail.com> wrote on Mon, Aug 12, 2019 at 1:45 PM:

> Hi Cam,
>
> The Flink master should not die when it gets disconnected from task managers.
> It may exit in the cases below:
> 1. The job terminated (FINISHED/FAILED/CANCELED). If your job is
> configured with no restart retry, a TM failure can cause the job to be
> FAILED.
> 2. The JM lost HA leadership, e.g. lost its connection to ZK.
> 3. It encountered other unexpected fatal errors. In this case we need to check
> the log to see what happened.
>
> Thanks,
> Zhu Zhu
>
> Cam Mach <cammac...@gmail.com> wrote on Mon, Aug 12, 2019 at 12:15 PM:
>
>> Hello Flink experts,
>>
>> We are running Flink under Kubernetes and see that the Job Manager
>> dies/restarts whenever a Task Manager dies/restarts or they can't connect to
>> each other. Are there any specific configurations/parameters that we need to
>> turn on to stop this? Or is this expected?
>>
>> Thanks,
>> Cam