Hi Renkai, it seems to me as if the TM lost its network connection somehow. As a result, the JM's heartbeats go unanswered and it marks the TM as terminated. This would also explain why the TM can no longer talk to ZooKeeper.
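
To rule out a plain connectivity problem, something like the following could be run from the JM host against the TM address that shows up in the logs. This is only a minimal sketch: the helper name `can_connect` and the 3-second timeout are my own assumptions, and a successful TCP connect only shows that the port is reachable, not that Akka's heartbeats actually get through.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be opened.

    Hypothetical helper for illustration; it is not part of Flink or Akka.
    """
    try:
        # create_connection resolves the host and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable networks.
        return False
```

For example, `can_connect("10.17.123.56", 59247)` run on the JM machine would check the address reported as unreachable. If it returns False while the TM process is still running, that points towards the network rather than the TM itself.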

Is this problem reproducible? If so, could you share the full logs with us?

Cheers,
Till

On Fri, Nov 25, 2016 at 5:12 AM, Renkai <gaelook...@gmail.com> wrote:

> some additional logs I found in jobManager.
>
> 2016-11-25 07:19:57,958 WARN  akka.remote.RemoteWatcher - Detected unreachable: [akka.tcp://flink@10.17.123.56:59247]
> 2016-11-25 07:19:57,962 INFO  org.apache.flink.runtime.jobmanager.JobManager - Task manager akka.tcp://flink@10.17.123.56:59247/user/taskmanager terminated.