It should be roughly the same settings that you use in your JobManager. They 
are described here: 
https://ci.apache.org/projects/flink/flink-docs-master/ops/config.html#zookeeper-based-ha-mode

> On 14. Feb 2018, at 15:32, Chirag Dewan <chirag.dewa...@yahoo.in> wrote:
> 
> Thanks Aljoscha.
> 
> I haven't checked that bit. Is there any configuration for TaskManagers to 
> find ZK?
> 
> Regards,
> 
> Chirag
> 
> Sent from Yahoo Mail on Android 
> <https://overview.mail.yahoo.com/mobile/?.src=Android>
> On Wed, 14 Feb 2018 at 7:43 PM, Aljoscha Krettek
> <aljos...@apache.org> wrote:
> Do you see in the logs whether the TaskManager correctly connect to ZooKeeper 
> as well? They need this in order to find the JobManager leader.
> 
> Best,
> Aljoscha
> 
>> On 14. Feb 2018, at 06:12, Chirag Dewan <chirag.dewa...@yahoo.in 
>> <mailto:chirag.dewa...@yahoo.in>> wrote:
>> 
>> Hi,
>> 
>> I am trying to deploy a Flink cluster (1 JM, 2TM) on a Docker Swarm. For 
>> JobManager HA, I have started a 3 node zookeeper service on the same swarm 
>> network and configured Flink's zookeeper quorum with zookeeper service 
>> instances. 
>> 
>> JobManager gets started with the LeaderElectionService and gets assigned a 
>> LeaderSessionID too, which I can see from the following log 
>> statements(attaching only related logs) :
>> 
>> org.apache.flink.runtime.leaderelection.ZooKeeperLeaderElectionService  - 
>> Starting ZooKeeperLeaderElectionService   
>> org.apache.flink.runtime.leaderretrieval.ZooKeeperLeaderRetrievalService  - 
>> Starting ZooKeeperLeaderRetrievalService.
>> org.apache.flink.runtime.leaderretrieval.ZooKeeperLeaderRetrievalService  - 
>> Starting ZooKeeperLeaderRetrievalService.
>> JobManager akka.tcp://flink@jobmanager:6123/user/jobmanager <> was granted 
>> leadership with leader session ID Some(1f3b2ec6-77b6-4532-928f-ad8befd5202f).
>>  Trying to associate with JobManager leader 
>> akka.tcp://flink@jobmanager:6123/user/jobmanager <>
>>  Resource Manager associating with leading JobManager 
>> Actor[akka://flink/user/jobmanager#590681231 <>] - leader session 
>> 1f3b2ec6-77b6-4532-928f-ad8befd5202f
>> 
>> 
>> But TaskManagers are not able to register with the JobManager and gives the 
>> following error:
>> 
>> Discard message 
>> LeaderSessionMessage(00000000-0000-0000-0000-000000000000,RegisterTaskManager(4fc8aceeae1e27e42b9f16df6c0cf5e3,4fc8aceeae1e27e42b9f16df6c0cf5e3
>>  @ a118cdf39114 (dataPort=43017),cores=1, physMem=1044111360, 
>> heap=536870912, managed=324208384,1)) because the expected leader session ID 
>> 1f3b2ec6-77b6-4532-928f-ad8befd5202f did not equal the received leader 
>> session ID 00000000-0000-0000-0000-000000000000.
>> 
>> Seems like the ResourceManager was not able to retrieve the LeaderSessionID 
>> and passed 00 ID. 
>> 
>> One interesting thing I observed was a ZK version log:
>> 
>> The version of ZooKeeper being used doesn't support Container nodes. 
>> CreateMode.PERSISTENT will be used instead.
>> 
>> Is this a ZK version problem? Should I be using ZK 3.4.6?
>> 
>> My configuration:
>> 
>> Flink Version : 1.4.0
>> ZK version : 3.4.11 (I just pulled the latest image)
>> 
>> Thanks in advance. 
>> 
>> Chirag
>> 
> 

Reply via email to