Hi Samir, could you share the logs of the two JMs and the log where you saw the FencingTokenException with us?
It looks to me as if the TM contacted the ResourceManager with an outdated fencing token (an outdated leader session id). This can happen, and the TM should try to reconnect to the RM once it learns about the new leader session id via ZooKeeper. You could, for example, check that ZooKeeper contains the valid leader information.

Cheers,
Till

On Fri, Oct 5, 2018 at 9:58 AM Samir Tusharbhai Chauhan <samir.tusharbhai.chau...@prudential.com.sg> wrote:

> Hi,
>
> I am having issue in setting up cluster for Flink. I have 2 nodes for Job Manager and 2 nodes for Task Manager.
>
> My configuration file looks like this.
>
> jobmanager.rpc.port: 6123
> jobmanager.heap.size: 2048m
> taskmanager.heap.size: 2048m
> taskmanager.numberOfTaskSlots: 64
> parallelism.default: 1
> rest.port: 8081
> high-availability.jobmanager.port: 50010
> high-availability: zookeeper
> high-availability.storageDir: file:///sharedflink/state_dir/ha/
> high-availability.zookeeper.quorum: host1:2181,host2:2181,host3:2181
> high-availability.zookeeper.path.root: /flink
> high-availability.cluster-id: /flick_ns
>
> state.backend: rocksdb
> state.checkpoints.dir: file:///sharedflink/state_dir/backend
> state.savepoints.dir: file:///sharedflink/state_dir/savepoint
> state.backend.incremental: false
> state.backend.rocksdb.timer-service.factory: rocksdb
> state.backend.local-recovery: false
>
> But when I start services, I get this error message.
>
> java.util.concurrent.CompletionException:
> org.apache.flink.runtime.rpc.exceptions.FencingTokenException: Fencing token mismatch: Ignoring message RemoteFencedMessage(b00185a18ea3da17ebe39ac411a84f3a, RemoteRpcInvocation(registerTaskExecutor(String, ResourceID, int, HardwareDescription, Time))) because the fencing token b00185a18ea3da17ebe39ac411a84f3a did not match the expected fencing token bce1729df0a2ab8a7ea0426ba9994482.
>     at java.util.concurrent.CompletableFuture.encodeThrowable(CompletableFuture.java:292)
>
> But when I run JM and TM in single box, it is working fine.
>
> Please help to resolve this issue ASAP as I am running out of option and time.
>
> -Samir Chauhan
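
To illustrate the fencing check behind that exception, here is a minimal sketch in plain Java. It is a simplified model and not Flink's actual code: `FencedEndpoint`, `onNewLeaderSession` and `FencingDemo` are hypothetical names, and the step where the TM picks up the new token is a plain assignment here, whereas in a real cluster that information would come from ZooKeeper. The two token values are the ones from the exception message above.

```java
import java.util.Objects;

/** Simplified model of a fenced RPC endpoint; NOT Flink's actual implementation. */
final class FencedEndpoint {
    // Token of the current leader session; replaced on every leader change.
    private volatile String fencingToken;

    FencedEndpoint(String initialToken) {
        this.fencingToken = Objects.requireNonNull(initialToken);
    }

    /** Hypothetical hook: called when a new leader session starts. */
    void onNewLeaderSession(String newToken) {
        this.fencingToken = Objects.requireNonNull(newToken);
    }

    /** Accepts the call only if the caller's token matches the current one. */
    String registerTaskExecutor(String callerToken, String taskExecutorId) {
        if (!fencingToken.equals(callerToken)) {
            throw new IllegalStateException(
                "Fencing token mismatch: got " + callerToken + ", expected " + fencingToken);
        }
        return "registered " + taskExecutorId;
    }
}

final class FencingDemo {
    public static void main(String[] args) {
        String staleToken = "b00185a18ea3da17ebe39ac411a84f3a";
        String currentToken = "bce1729df0a2ab8a7ea0426ba9994482";

        FencedEndpoint resourceManager = new FencedEndpoint(staleToken);
        resourceManager.onNewLeaderSession(currentToken); // leader has changed

        String tmToken = staleToken; // the TM has not seen the new leader yet
        try {
            resourceManager.registerTaskExecutor(tmToken, "tm-1");
        } catch (IllegalStateException e) {
            System.out.println("rejected: " + e.getMessage());
            tmToken = currentToken; // assumption: learned via ZooKeeper in a real setup
        }
        // With the refreshed token the registration goes through.
        System.out.println(resourceManager.registerTaskExecutor(tmToken, "tm-1"));
    }
}
```

In this model a rejected registration is harmless as long as the caller eventually learns the current token. If the mismatch never goes away, the caller is not receiving the new leader information, which is why checking what ZooKeeper holds is a sensible first step.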