Does anyone know why? I resumed the build job, but it is still blocked at the "Build Dimension Dictionary" step. The following log keeps appearing; it seems Kylin tries to connect to the ZooKeeper servers one by one:
2016-08-18 14:37:39,648 INFO [Thread-10-SendThread(bigdata-5:2181)] zookeeper.ClientCnxn:1096 : Client session timed out, have not heard from server in 27230ms for sessionid 0x15698315e3d001d, closing socket connection and attempting reconnect
2016-08-18 14:37:39,749 INFO [Thread-10-EventThread] state.ConnectionStateManager:228 : State change: SUSPENDED
2016-08-18 14:37:39,851 INFO [Thread-10-SendThread(bigdata-master:2181)] zookeeper.ClientCnxn:975 : Opening socket connection to server bigdata-master/192.168.0.10:2181. Will not attempt to authenticate using SASL (unknown error)
2016-08-18 14:37:39,851 INFO [Thread-10-SendThread(bigdata-master:2181)] zookeeper.ClientCnxn:852 : Socket connection established to bigdata-master/192.168.0.10:2181, initiating session
2016-08-18 14:37:39,855 INFO [Thread-10-SendThread(bigdata-master:2181)] zookeeper.ClientCnxn:1235 : Session establishment complete on server bigdata-master/192.168.0.10:2181, sessionid = 0x15698315e3d001d, negotiated timeout = 40000
2016-08-18 14:37:39,856 INFO [Thread-10-EventThread] state.ConnectionStateManager:228 : State change: RECONNECTED
2016-08-18 14:38:10,290 INFO [Thread-10-SendThread(bigdata-master:2181)] zookeeper.ClientCnxn:1096 : Client session timed out, have not heard from server in 30431ms for sessionid 0x15698315e3d001d, closing socket connection and attempting reconnect

Thanks,
Minghao Feng

________________________________
From: ?? ? <[email protected]>
Sent: Thursday, August 18, 2016 1:09:23 PM
To: [email protected]
Subject: Kylin lost connection to zookeeper and keep on reconnect and fail in "Build Dimension Dictionary" step

Hi,

I'm sending this here for help too. I'm new to Kylin. I ran into a problem when building a cube: the job is blocked at the "Build Dimension Dictionary" step.
Here is the log:

2016-08-18 10:42:49,471 INFO [pool-8-thread-1] threadpool.DefaultScheduler:109 : Job Fetcher: 1 running, 1 actual running, 0 ready, 102 others
2016-08-18 10:42:49,619 INFO [Thread-10-SendThread(bigdata-master:2181)] zookeeper.ClientCnxn:975 : Opening socket connection to server bigdata-master/192.168.0.10:2181. Will not attempt to authenticate using SASL (unknown error)
2016-08-18 10:42:49,620 INFO [Thread-10-SendThread(bigdata-master:2181)] zookeeper.ClientCnxn:852 : Socket connection established to bigdata-master/192.168.0.10:2181, initiating session
2016-08-18 10:42:49,630 INFO [Thread-10-SendThread(bigdata-master:2181)] zookeeper.ClientCnxn:1235 : Session establishment complete on server bigdata-master/192.168.0.10:2181, sessionid = 0x55697ff17be0018, negotiated timeout = 40000
2016-08-18 10:42:49,632 INFO [Thread-10-EventThread] state.ConnectionStateManager:228 : State change: RECONNECTED
2016-08-18 10:43:11,761 INFO [pool-9-thread-10-SendThread(bigdata-5:2181)] zookeeper.ClientCnxn:975 : Opening socket connection to server bigdata-5/192.168.0.48:2181. Will not attempt to authenticate using SASL (unknown error)
2016-08-18 10:43:11,767 INFO [pool-9-thread-10-SendThread(bigdata-5:2181)] zookeeper.ClientCnxn:852 : Socket connection established to bigdata-5/192.168.0.48:2181, initiating session
2016-08-18 10:43:11,775 INFO [pool-9-thread-10-SendThread(bigdata-5:2181)] zookeeper.ClientCnxn:1094 : Unable to reconnect to ZooKeeper service, session 0x2569b259b110000 has expired, closing socket connection
2016-08-18 10:43:11,840 WARN [pool-9-thread-10-EventThread] client.ConnectionManager$HConnectionImplementation:2371 : This client just lost it's session with ZooKeeper, closing it. It will be recreated next time someone needs it
org.apache.zookeeper.KeeperException$SessionExpiredException: KeeperErrorCode = Session expired
    at org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.connectionEvent(ZooKeeperWatcher.java:700)
    at org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.process(ZooKeeperWatcher.java:611)
    at org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:522)
    at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:498)
2016-08-18 10:43:11,854 INFO [pool-9-thread-10-EventThread] client.ConnectionManager$HConnectionImplementation:1710 : Closing zookeeper sessionid=0x2569b259b110000
2016-08-18 10:43:11,856 INFO [pool-9-thread-10-EventThread] zookeeper.ClientCnxn:512 : EventThread shut down
2016-08-18 10:43:39,875 INFO [localhost-startStop-1-SendThread(bigdata-2:2181)] zookeeper.ClientCnxn:1096 : Client session timed out, have not heard from server in 28107ms for sessionid 0x356980a3de8000c, closing socket connection and attempting reconnect
2016-08-18 10:43:39,875 INFO [Thread-10-SendThread(bigdata-master:2181)] zookeeper.ClientCnxn:1096 : Client session timed out, have not heard from server in 28113ms for sessionid 0x55697ff17be0018, closing socket connection and attempting reconnect
2016-08-18 10:43:39,976 INFO [Thread-10-EventThread] state.ConnectionStateManager:228 : State change: SUSPENDED

Then Kylin keeps trying to reconnect but fails:

2016-08-18 10:45:37,817 ERROR [Curator-Framework-0] curator.ConnectionState:200 : Connection timed out for connection string (bigdata-master:2181,bigdata-2:2181,bigdata-3:2181,bigdata-4:2181,bigdata-5:2181) and timeout (15000) / elapsed (27967)
org.apache.curator.CuratorConnectionLossException: KeeperErrorCode = ConnectionLoss
    at org.apache.curator.ConnectionState.checkTimeouts(ConnectionState.java:197)
    at org.apache.curator.ConnectionState.getZooKeeper(ConnectionState.java:87)
    at org.apache.curator.CuratorZookeeperClient.getZooKeeper(CuratorZookeeperClient.java:115)
    at org.apache.curator.framework.imps.CuratorFrameworkImpl.performBackgroundOperation(CuratorFrameworkImpl.java:806)
    at org.apache.curator.framework.imps.CuratorFrameworkImpl.backgroundOperationsLoop(CuratorFrameworkImpl.java:792)
    at org.apache.curator.framework.imps.CuratorFrameworkImpl.access$300(CuratorFrameworkImpl.java:62)
    at org.apache.curator.framework.imps.CuratorFrameworkImpl$4.call(CuratorFrameworkImpl.java:257)
    at java.util.concurrent.FutureTask.run(FutureTask.java:262)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)
2016-08-18 10:45:37,821 INFO [Thread-10-SendThread(bigdata-5:2181)] zookeeper.ClientCnxn:975 : Opening socket connection to server bigdata-5/192.168.0.48:2181. Will not attempt to authenticate using SASL (unknown error)

Kylin also stopped after several attempts. When I restart Kylin, it keeps trying and failing. I've checked ZooKeeper, and it seems to have had no problems during this period.

Does anyone know what happened and how I can fix the problem?

This is my environment:

Hadoop 2.7.2
Spark 1.6.2
HBase 1.2.2
ZooKeeper 3.4.6
Hive 2.1.0
Kylin 1.5.3

Thanks,
Minghao Feng
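
In case it helps anyone comparing against their own kylin.log, here is a minimal Python sketch that pulls the "have not heard from server in Nms" values and the negotiated timeout out of log text like the above, grouped by session id. The function name summarize_timeouts and the SAMPLE variable are made up for illustration; the sample lines are copied from the log in this thread. As far as I understand, the ZooKeeper client drops a connection after roughly two thirds of the negotiated session timeout without hearing from the server, which would be about 26667 ms for the 40000 ms seen here, so the sketch prints that figure next to the observed values. Treat that as an assumption to verify against your ZooKeeper version, not a diagnosis.

```python
import re
from collections import defaultdict

# Patterns for two ZooKeeper client messages that appear in the log above.
TIMED_OUT = re.compile(
    r"have not heard from server in (\d+)ms for sessionid (0x[0-9a-f]+)"
)
NEGOTIATED = re.compile(
    r"sessionid = (0x[0-9a-f]+), negotiated timeout = (\d+)"
)


def summarize_timeouts(log_text):
    """Group negotiated timeouts and observed silences (in ms) by session id."""
    sessions = defaultdict(lambda: {"negotiated_ms": None, "silences_ms": []})
    for sid, ms in NEGOTIATED.findall(log_text):
        sessions[sid]["negotiated_ms"] = int(ms)
    for ms, sid in TIMED_OUT.findall(log_text):
        sessions[sid]["silences_ms"].append(int(ms))
    return dict(sessions)


# Three lines copied from the log in this thread.
SAMPLE = """\
2016-08-18 14:37:39,648 INFO [Thread-10-SendThread(bigdata-5:2181)] zookeeper.ClientCnxn:1096 : Client session timed out, have not heard from server in 27230ms for sessionid 0x15698315e3d001d, closing socket connection and attempting reconnect
2016-08-18 14:37:39,855 INFO [Thread-10-SendThread(bigdata-master:2181)] zookeeper.ClientCnxn:1235 : Session establishment complete on server bigdata-master/192.168.0.10:2181, sessionid = 0x15698315e3d001d, negotiated timeout = 40000
2016-08-18 14:38:10,290 INFO [Thread-10-SendThread(bigdata-master:2181)] zookeeper.ClientCnxn:1096 : Client session timed out, have not heard from server in 30431ms for sessionid 0x15698315e3d001d, closing socket connection and attempting reconnect
"""

if __name__ == "__main__":
    for sid, info in summarize_timeouts(SAMPLE).items():
        negotiated = info["negotiated_ms"]
        # Assumption: the client read timeout is about 2/3 of the negotiated timeout.
        read_timeout = negotiated * 2 // 3 if negotiated is not None else None
        print(sid, "negotiated:", negotiated,
              "approx. read timeout:", read_timeout,
              "silences:", info["silences_ms"])
```

To run it on a real log, read the file contents and pass them in place of SAMPLE. Silences that are consistently longer than the approximate read timeout would point at the client not receiving server responses in time (for example long pauses on the Kylin side or network trouble) rather than at a wrong server address, but the script itself only summarizes what the log already says.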
