Hi Sathya,

have you checked this yet?
https://ci.apache.org/projects/flink/flink-docs-release-1.3/setup/jobmanager_high_availability.html
I'm no expert on the HA setup, have you also tried Flink 1.3 just in case?

Nico

On Wednesday, 8 November 2017 04:02:47 CET Sathya Hariesh Prakash (sathypra) wrote:
> Hi – We’re currently testing Flink HA and running into a zookeeper timeout
> issue. Error log below.
>
> Is there a production checklist or any information on parameters that are
> related to flink HA that I need to pay attention to?
>
> Any pointers would really help. Please let me know if any additional
> information is needed. Thanks!
>
> NOTE: I see multiple connection timeout messages. With different elapsed
> times.
>
> {
>   "timeMillis":1510095254557,
>   "thread":"Curator-Framework-0",
>   "level":"ERROR",
>   "loggerName":"org.apache.flink.shaded.org.apache.curator.ConnectionState",
>   "message":"Connection timed out for connection string (zookeeper.system.svc.cluster.local:2181) and timeout (15000) / elapsed (15004)",
>   "thrown":{
>     "commonElementCount":0,
>     "localizedMessage":"KeeperErrorCode = ConnectionLoss",
>     "message":"KeeperErrorCode = ConnectionLoss",
>     "name":"org.apache.flink.shaded.org.apache.curator.CuratorConnectionLossException",
>     "extendedStackTrace":[
>       {
>         "class":"org.apache.flink.shaded.org.apache.curator.ConnectionState",
>         "method":"checkTimeouts",
>         "file":"ConnectionState.java",
>         "line":197,
>         "exact":true,
>         "location":"flink-runtime_2.10-1.2.jar",
>         "version":"1.2"
>       },
>       {
>         "class":"org.apache.flink.shaded.org.apache.curator.ConnectionState",
>         "method":"getZooKeeper",
>         "file":"ConnectionState.java",
>         "line":87,
>         "exact":true,
>         "location":"flink-runtime_2.10-1.2.jar",
>         "version":"1.2"
>       },
>       {
>         "class":"org.apache.flink.shaded.org.apache.curator.CuratorZookeeperClient",
>         "method":"getZooKeeper",
>         "file":"CuratorZookeeperClient.java",
>         "line":115,
>         "exact":true,
>         "location":"flink-runtime_2.10-1.2.jar",
>         "version":"1.2"
>       },
>       {
>         "class":"org.apache.flink.shaded.org.apache.curator.framework.imps.CuratorFrameworkImpl",
>         "method":"performBackgroundOperation",
>         "file":"CuratorFrameworkImpl.java",
>         "line":806,
>         "exact":true,
>         "location":"flink-runtime_2.10-1.2.jar",
>         "version":"1.2"
>       },
>       {
>         "class":"org.apache.flink.shaded.org.apache.curator.framework.imps.CuratorFrameworkImpl",
>         "method":"backgroundOperationsLoop",
>         "file":"CuratorFrameworkImpl.java",
>         "line":792,
>         "exact":true,
>         "location":"flink-runtime_2.10-1.2.jar",
>         "version":"1.2"
>       },
>       {
>         "class":"org.apache.flink.shaded.org.apache.curator.framework.imps.CuratorFrameworkImpl",
>         "method":"access$300",
>         "file":"CuratorFrameworkImpl.java",
>         "line":62,
>         "exact":true,
>         "location":"flink-runtime_2.10-1.2.jar",
>         "version":"1.2"
>       },
>       {
>         "class":"org.apache.flink.shaded.org.apache.curator.framework.imps.CuratorFrameworkImpl$4",
>         "method":"call",
>         "file":"CuratorFrameworkImpl.java",
>         "line":257,
>         "exact":true,
>         "location":"flink-runtime_2.10-1.2.jar",
>         "version":"1.2"
>       },
>       {
>         "class":"java.util.concurrent.FutureTask",
>         "method":"run",
>         "file":"FutureTask.java",
>         "line":266,
>         "exact":true,
>         "location":"?",
>         "version":"1.8.0_66"
>       },
>       {
>         "class":"java.util.concurrent.ThreadPoolExecutor",
>         "method":"runWorker",
>         "file":"ThreadPoolExecutor.java",
>         "line":1142,
>         "exact":true,
>         "location":"?",
>         "version":"1.8.0_66"
>       },
>       {
>         "class":"java.util.concurrent.ThreadPoolExecutor$Worker",
>         "method":"run",
>         "file":"ThreadPoolExecutor.java",
>         "line":617,
>         "exact":true,
>         "location":"?",
>         "version":"1.8.0_66"
>       },
>       {
>         "class":"java.lang.Thread",
>         "method":"run",
>         "file":"Thread.java",
>         "line":745,
>         "exact":true,
>         "location":"?",
>         "version":"1.8.0_66"
>       }
>     ]
>   },
>   "endOfBatch":false,
>   "loggerFqcn":"org.apache.logging.slf4j.Log4jLogger",
>   "threadId":258,
>   "threadPriority":5
> }
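Since the report mentions several connection timeout messages with different elapsed times, it may help to pull the timeout and elapsed values out of each log event and compare them. The following is only a minimal sketch: it assumes each event is a JSON object with a `message` field in the format quoted in the log above, and the function name `parse_timeout` and the `sample` event are made up for illustration.

```python
import json
import re

# Matches the Curator timeout message format quoted in the log event above.
TIMEOUT_RE = re.compile(
    r"Connection timed out for connection string \((?P<conn>[^)]*)\) "
    r"and timeout \((?P<timeout>\d+)\) / elapsed \((?P<elapsed>\d+)\)"
)


def parse_timeout(event_json):
    """Return (connection string, timeout ms, elapsed ms), or None if no match."""
    event = json.loads(event_json)
    match = TIMEOUT_RE.search(event.get("message", ""))
    if match is None:
        return None
    return (
        match.group("conn"),
        int(match.group("timeout")),
        int(match.group("elapsed")),
    )


# Hypothetical sample event, reduced to the fields the parser needs.
sample = json.dumps({
    "level": "ERROR",
    "message": "Connection timed out for connection string "
               "(zookeeper.system.svc.cluster.local:2181) "
               "and timeout (15000) / elapsed (15004)",
})

print(parse_timeout(sample))
```

Running this over all the timeout events would show whether the elapsed times sit just past the configured timeout or well beyond it, which might help narrow down where the delay comes from.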