The TaskManagers need roughly the same HA settings that you use for your JobManager. They are described here: https://ci.apache.org/projects/flink/flink-docs-master/ops/config.html#zookeeper-based-ha-mode
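
As a rough sketch only: the ZooKeeper host names and the storage path below are made up, and the key names are the ones I recall for the 1.4 line, so please check them against the page above. The HA part of flink-conf.yaml on each TaskManager could then look something like this:

```yaml
# flink-conf.yaml: keep these entries identical on the JobManager and on every TaskManager.
high-availability: zookeeper

# Hypothetical host names; replace them with your own ZooKeeper service instances.
high-availability.zookeeper.quorum: zookeeper1:2181,zookeeper2:2181,zookeeper3:2181

# Optional namespace settings. If you set them, use the same values on all nodes,
# otherwise the TaskManagers look for the leader under a different ZooKeeper path.
high-availability.zookeeper.path.root: /flink
high-availability.cluster-id: /default_ns

# Placeholder path for JobManager recovery metadata; it must be reachable from all nodes.
high-availability.storageDir: hdfs:///flink/recovery
```

If the TaskManager containers start without these entries, they would presumably not look up the leader in ZooKeeper at all, which would fit the all-zero leader session ID in your log.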

> On 14. Feb 2018, at 15:32, Chirag Dewan <chirag.dewa...@yahoo.in> wrote:
>
> Thanks Aljoscha.
>
> I haven't checked that bit. Is there any configuration for TaskManagers to
> find ZK?
>
> Regards,
>
> Chirag
>
> On Wed, 14 Feb 2018 at 7:43 PM, Aljoscha Krettek
> <aljos...@apache.org> wrote:
>
> Do you see in the logs whether the TaskManager correctly connect to ZooKeeper
> as well? They need this in order to find the JobManager leader.
>
> Best,
> Aljoscha
>
>> On 14. Feb 2018, at 06:12, Chirag Dewan <chirag.dewa...@yahoo.in> wrote:
>>
>> Hi,
>>
>> I am trying to deploy a Flink cluster (1 JM, 2TM) on a Docker Swarm. For
>> JobManager HA, I have started a 3 node zookeeper service on the same swarm
>> network and configured Flink's zookeeper quorum with zookeeper service
>> instances.
>>
>> JobManager gets started with the LeaderElectionService and gets assigned a
>> LeaderSessionID too, which I can see from the following log
>> statements(attaching only related logs) :
>>
>> org.apache.flink.runtime.leaderelection.ZooKeeperLeaderElectionService -
>> Starting ZooKeeperLeaderElectionService
>> org.apache.flink.runtime.leaderretrieval.ZooKeeperLeaderRetrievalService -
>> Starting ZooKeeperLeaderRetrievalService.
>> org.apache.flink.runtime.leaderretrieval.ZooKeeperLeaderRetrievalService -
>> Starting ZooKeeperLeaderRetrievalService.
>> JobManager akka.tcp://flink@jobmanager:6123/user/jobmanager was granted
>> leadership with leader session ID Some(1f3b2ec6-77b6-4532-928f-ad8befd5202f).
>> Trying to associate with JobManager leader
>> akka.tcp://flink@jobmanager:6123/user/jobmanager
>> Resource Manager associating with leading JobManager
>> Actor[akka://flink/user/jobmanager#590681231] - leader session
>> 1f3b2ec6-77b6-4532-928f-ad8befd5202f
>>
>> But TaskManagers are not able to register with the JobManager and gives the
>> following error:
>>
>> Discard message
>> LeaderSessionMessage(00000000-0000-0000-0000-000000000000,RegisterTaskManager(4fc8aceeae1e27e42b9f16df6c0cf5e3,4fc8aceeae1e27e42b9f16df6c0cf5e3
>> @ a118cdf39114 (dataPort=43017),cores=1, physMem=1044111360,
>> heap=536870912, managed=324208384,1)) because the expected leader session ID
>> 1f3b2ec6-77b6-4532-928f-ad8befd5202f did not equal the received leader
>> session ID 00000000-0000-0000-0000-000000000000.
>>
>> Seems like the ResourceManager was not able to retrieve the LeaderSessionID
>> and passed 00 ID.
>>
>> One interesting thing I observed was a ZK version log:
>>
>> The version of ZooKeeper being used doesn't support Container nodes.
>> CreateMode.PERSISTENT will be used instead.
>>
>> Is this a ZK version problem? Should I be using ZK 3.4.6?
>>
>> My configuration:
>>
>> Flink Version : 1.4.0
>> ZK version : 3.4.11 (I just pulled the latest image)
>>
>> Thanks in advance.
>>
>> Chirag