I just found that the original Hive table has several dimensions with very high cardinality; maybe this caused the problem. I'm trying to reduce the cardinality and will keep you posted.
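A minimal sketch of how the cardinality of each candidate dimension column could be checked in Hive before building the cube. The table and column names (`fact_sales`, `user_id`, `order_id`) are hypothetical placeholders, not taken from the actual schema, and the helper only builds the query text; the queries would still need to be run through the Hive CLI or Beeline.

```python
def cardinality_queries(table, columns):
    """Build one Hive query per column that counts its distinct values."""
    if not columns:
        raise ValueError("at least one column is required")
    # One query per column keeps each statement simple to run and to read.
    return [
        f"SELECT COUNT(DISTINCT {col}) AS {col}_cardinality FROM {table}"
        for col in columns
    ]


if __name__ == "__main__":
    # Hypothetical table and columns, for illustration only.
    for query in cardinality_queries("fact_sales", ["user_id", "order_id"]):
        print(query)
```

Columns whose distinct count is close to the row count are the ones most likely to produce a very large dictionary.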
Thanks,
Minghao Feng

________________________________
From: 明浩 冯 <[email protected]>
Sent: Thursday, August 18, 2016 8:07:01 PM
To: [email protected]
Subject: Re: Kylin lost connection to zookeeper and keep on reconnect and fail in "Build Dimension Dictionary" step

Hi Yang,

Sorry for the confusing title. I checked the log again and I think Kylin can connect to ZooKeeper, but I don't know why the connection times out, after which Kylin tries to reconnect to another ZooKeeper server. Do you know of any possible reasons?

Thanks,
Minghao Feng

Sent from my iPhone

On August 18, 2016, at 7:15 PM, Li Yang <[email protected]> wrote:

Failing to connect to ZooKeeper is an environment problem. You know your environment better than anyone else, I guess.

On Thu, Aug 18, 2016 at 2:41 PM, 明浩 冯 <[email protected]> wrote:

Does anyone know why? I resumed the build job, but it is still blocked at the "Build Dimension Dictionary" step. The following log keeps showing; it seems Kylin tries to connect to the ZooKeeper servers one by one:

2016-08-18 14:37:39,648 INFO [Thread-10-SendThread(bigdata-5:2181)] zookeeper.ClientCnxn:1096 : Client session timed out, have not heard from server in 27230ms for sessionid 0x15698315e3d001d, closing socket connection and attempting reconnect
2016-08-18 14:37:39,749 INFO [Thread-10-EventThread] state.ConnectionStateManager:228 : State change: SUSPENDED
2016-08-18 14:37:39,851 INFO [Thread-10-SendThread(bigdata-master:2181)] zookeeper.ClientCnxn:975 : Opening socket connection to server bigdata-master/192.168.0.10:2181.
Will not attempt to authenticate using SASL (unknown error)
2016-08-18 14:37:39,851 INFO [Thread-10-SendThread(bigdata-master:2181)] zookeeper.ClientCnxn:852 : Socket connection established to bigdata-master/192.168.0.10:2181, initiating session
2016-08-18 14:37:39,855 INFO [Thread-10-SendThread(bigdata-master:2181)] zookeeper.ClientCnxn:1235 : Session establishment complete on server bigdata-master/192.168.0.10:2181, sessionid = 0x15698315e3d001d, negotiated timeout = 40000
2016-08-18 14:37:39,856 INFO [Thread-10-EventThread] state.ConnectionStateManager:228 : State change: RECONNECTED
2016-08-18 14:38:10,290 INFO [Thread-10-SendThread(bigdata-master:2181)] zookeeper.ClientCnxn:1096 : Client session timed out, have not heard from server in 30431ms for sessionid 0x15698315e3d001d, closing socket connection and attempting reconnect

Thanks,
Minghao Feng

________________________________
From: 明浩 冯 <[email protected]>
Sent: Thursday, August 18, 2016 1:09:23 PM
To: [email protected]
Subject: Kylin lost connection to zookeeper and keep on reconnect and fail in "Build Dimension Dictionary" step

Hi,

Sending this here for help too. I'm a beginner with Kylin. I ran into a problem when building a cube: the job is blocked at the "Build Dimension Dictionary" step. Here is the log:

2016-08-18 10:42:49,471 INFO [pool-8-thread-1] threadpool.DefaultScheduler:109 : Job Fetcher: 1 running, 1 actual running, 0 ready, 102 others
2016-08-18 10:42:49,619 INFO [Thread-10-SendThread(bigdata-master:2181)] zookeeper.ClientCnxn:975 : Opening socket connection to server bigdata-master/192.168.0.10:2181.
Will not attempt to authenticate using SASL (unknown error)
2016-08-18 10:42:49,620 INFO [Thread-10-SendThread(bigdata-master:2181)] zookeeper.ClientCnxn:852 : Socket connection established to bigdata-master/192.168.0.10:2181, initiating session
2016-08-18 10:42:49,630 INFO [Thread-10-SendThread(bigdata-master:2181)] zookeeper.ClientCnxn:1235 : Session establishment complete on server bigdata-master/192.168.0.10:2181, sessionid = 0x55697ff17be0018, negotiated timeout = 40000
2016-08-18 10:42:49,632 INFO [Thread-10-EventThread] state.ConnectionStateManager:228 : State change: RECONNECTED
2016-08-18 10:43:11,761 INFO [pool-9-thread-10-SendThread(bigdata-5:2181)] zookeeper.ClientCnxn:975 : Opening socket connection to server bigdata-5/192.168.0.48:2181. Will not attempt to authenticate using SASL (unknown error)
2016-08-18 10:43:11,767 INFO [pool-9-thread-10-SendThread(bigdata-5:2181)] zookeeper.ClientCnxn:852 : Socket connection established to bigdata-5/192.168.0.48:2181, initiating session
2016-08-18 10:43:11,775 INFO [pool-9-thread-10-SendThread(bigdata-5:2181)] zookeeper.ClientCnxn:1094 : Unable to reconnect to ZooKeeper service, session 0x2569b259b110000 has expired, closing socket connection
2016-08-18 10:43:11,840 WARN [pool-9-thread-10-EventThread] client.ConnectionManager$HConnectionImplementation:2371 : This client just lost it's session with ZooKeeper, closing it.
It will be recreated next time someone needs it
org.apache.zookeeper.KeeperException$SessionExpiredException: KeeperErrorCode = Session expired
    at org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.connectionEvent(ZooKeeperWatcher.java:700)
    at org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.process(ZooKeeperWatcher.java:611)
    at org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:522)
    at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:498)
2016-08-18 10:43:11,854 INFO [pool-9-thread-10-EventThread] client.ConnectionManager$HConnectionImplementation:1710 : Closing zookeeper sessionid=0x2569b259b110000
2016-08-18 10:43:11,856 INFO [pool-9-thread-10-EventThread] zookeeper.ClientCnxn:512 : EventThread shut down
2016-08-18 10:43:39,875 INFO [localhost-startStop-1-SendThread(bigdata-2:2181)] zookeeper.ClientCnxn:1096 : Client session timed out, have not heard from server in 28107ms for sessionid 0x356980a3de8000c, closing socket connection and attempting reconnect
2016-08-18 10:43:39,875 INFO [Thread-10-SendThread(bigdata-master:2181)] zookeeper.ClientCnxn:1096 : Client session timed out, have not heard from server in 28113ms for sessionid 0x55697ff17be0018, closing socket connection and attempting reconnect
2016-08-18 10:43:39,976 INFO [Thread-10-EventThread] state.ConnectionStateManager:228 : State change: SUSPENDED

Then Kylin keeps trying to reconnect but fails:

2016-08-18 10:45:37,817 ERROR [Curator-Framework-0] curator.ConnectionState:200 : Connection timed out for connection string (bigdata-master:2181,bigdata-2:2181,bigdata-3:2181,bigdata-4:2181,bigdata-5:2181) and timeout (15000) / elapsed (27967)
org.apache.curator.CuratorConnectionLossException: KeeperErrorCode = ConnectionLoss
    at org.apache.curator.ConnectionState.checkTimeouts(ConnectionState.java:197)
    at org.apache.curator.ConnectionState.getZooKeeper(ConnectionState.java:87)
    at org.apache.curator.CuratorZookeeperClient.getZooKeeper(CuratorZookeeperClient.java:115)
    at org.apache.curator.framework.imps.CuratorFrameworkImpl.performBackgroundOperation(CuratorFrameworkImpl.java:806)
    at org.apache.curator.framework.imps.CuratorFrameworkImpl.backgroundOperationsLoop(CuratorFrameworkImpl.java:792)
    at org.apache.curator.framework.imps.CuratorFrameworkImpl.access$300(CuratorFrameworkImpl.java:62)
    at org.apache.curator.framework.imps.CuratorFrameworkImpl$4.call(CuratorFrameworkImpl.java:257)
    at java.util.concurrent.FutureTask.run(FutureTask.java:262)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)
2016-08-18 10:45:37,821 INFO [Thread-10-SendThread(bigdata-5:2181)] zookeeper.ClientCnxn:975 : Opening socket connection to server bigdata-5/192.168.0.48:2181. Will not attempt to authenticate using SASL (unknown error)

Kylin also stopped after several attempts. When I restart Kylin, it keeps trying and failing. I've checked ZooKeeper, and it seems to have had no problems during this period.

Does anyone know what happened and how I can fix the problem?

This is my environment:
Hadoop 2.7.2
Spark 1.6.2
HBase 1.2.2
ZooKeeper 3.4.6
Hive 2.1.0
Kylin 1.5.3

Thanks,
Minghao Feng
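
The timeout gaps reported in the log excerpts above can be pulled out mechanically. The following is a minimal sketch that assumes the log format shown in this thread; the function names are illustrative only. It also assumes, based on how the ZooKeeper client is understood to behave, that the client gives up on a server after roughly two thirds of the negotiated session timeout.

```python
import re

# Matches the ZooKeeper client message seen in the log excerpts in this thread.
TIMEOUT_RE = re.compile(
    r"have not heard from server in (\d+)ms for sessionid (0x[0-9a-fA-F]+)"
)


def find_session_timeouts(lines):
    """Return (session_id, silent_ms) pairs for every client timeout message."""
    hits = []
    for line in lines:
        match = TIMEOUT_RE.search(line)
        if match:
            hits.append((match.group(2), int(match.group(1))))
    return hits


def read_timeout_ms(negotiated_timeout_ms):
    """Assumption: the client waits about 2/3 of the negotiated session timeout."""
    return negotiated_timeout_ms * 2 // 3


if __name__ == "__main__":
    sample = [
        "2016-08-18 14:37:39,648 INFO [Thread-10-SendThread(bigdata-5:2181)] "
        "zookeeper.ClientCnxn:1096 : Client session timed out, have not heard "
        "from server in 27230ms for sessionid 0x15698315e3d001d, closing socket "
        "connection and attempting reconnect",
        "2016-08-18 14:37:39,749 INFO [Thread-10-EventThread] "
        "state.ConnectionStateManager:228 : State change: SUSPENDED",
    ]
    limit = read_timeout_ms(40000)  # "negotiated timeout = 40000" in the log
    for session_id, silent_ms in find_session_timeouts(sample):
        status = "exceeds" if silent_ms > limit else "is within"
        print(f"{session_id}: {silent_ms} ms {status} {limit} ms")
```

If every reported gap sits just above that limit while the ZooKeeper servers themselves look healthy, one possibility worth checking is that the Kylin process itself is stalling (for example in long garbage-collection pauses); the log alone does not prove this.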
