I just found that the original Hive table has several dimensions with very high 
cardinality; maybe this caused the problem.
I'm trying to reduce the cardinality and will keep you posted.
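
To see which dimensions are the heavy ones, one option is to count distinct values per column on a sample of the table. A minimal sketch in Python, assuming the sample has already been exported as a list of dict rows; the function names, column names, and threshold are hypothetical, not anything Kylin provides:

```python
from collections import defaultdict


def column_cardinality(rows, columns):
    """Return {column: number of distinct values} over an iterable of dict rows."""
    seen = defaultdict(set)
    for row in rows:
        for col in columns:
            seen[col].add(row.get(col))
    return {col: len(seen[col]) for col in columns}


def high_cardinality(counts, threshold):
    """List (column, distinct_count) pairs above the threshold, largest first."""
    flagged = [(col, n) for col, n in counts.items() if n > threshold]
    return sorted(flagged, key=lambda item: item[1], reverse=True)


# Hypothetical sample: user_id is nearly unique, country is not.
sample = [
    {"user_id": 1, "country": "CN"},
    {"user_id": 2, "country": "CN"},
    {"user_id": 3, "country": "US"},
]
counts = column_cardinality(sample, ["user_id", "country"])
print(high_cardinality(counts, threshold=2))
```

On the real table, a COUNT(DISTINCT ...) query in Hive would give the same numbers; the sketch only illustrates the idea on a small sample.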

Thanks,
Minghao Feng
________________________________
From: 明浩 冯 <[email protected]>
Sent: Thursday, August 18, 2016 8:07:01 PM
To: [email protected]
Subject: Re: Kylin lost connection to zookeeper and keep on reconnect and fail 
in "Build Dimension Dictionary" step

Hi Yang,

Sorry for the confusing title. I checked the log again, and I think Kylin can 
connect to ZooKeeper, but I don't know why the connection times out, after which 
Kylin tries to reconnect to another ZooKeeper server. Do you know of any 
possible reasons?

Thanks,
Minghao Feng

Sent from my iPhone

On Aug 18, 2016, at 7:15 PM, Li Yang <[email protected]> wrote:

Failing to connect to ZooKeeper is an environment problem. You know your 
environment better than anyone else, I guess.

On Thu, Aug 18, 2016 at 2:41 PM, 明浩 冯 <[email protected]> wrote:

Does anyone know why?

I resumed the build job, but it is still stuck at the "Build Dimension 
Dictionary" step. The following log keeps repeating; it seems Kylin tries to 
connect to the ZooKeeper servers one by one:


2016-08-18 14:37:39,648 INFO  [Thread-10-SendThread(bigdata-5:2181)] zookeeper.ClientCnxn:1096 : Client session timed out, have not heard from server in 27230ms for sessionid 0x15698315e3d001d, closing socket connection and attempting reconnect
2016-08-18 14:37:39,749 INFO  [Thread-10-EventThread] state.ConnectionStateManager:228 : State change: SUSPENDED
2016-08-18 14:37:39,851 INFO  [Thread-10-SendThread(bigdata-master:2181)] zookeeper.ClientCnxn:975 : Opening socket connection to server bigdata-master/192.168.0.10:2181. Will not attempt to authenticate using SASL (unknown error)
2016-08-18 14:37:39,851 INFO  [Thread-10-SendThread(bigdata-master:2181)] zookeeper.ClientCnxn:852 : Socket connection established to bigdata-master/192.168.0.10:2181, initiating session
2016-08-18 14:37:39,855 INFO  [Thread-10-SendThread(bigdata-master:2181)] zookeeper.ClientCnxn:1235 : Session establishment complete on server bigdata-master/192.168.0.10:2181, sessionid = 0x15698315e3d001d, negotiated timeout = 40000
2016-08-18 14:37:39,856 INFO  [Thread-10-EventThread] state.ConnectionStateManager:228 : State change: RECONNECTED
2016-08-18 14:38:10,290 INFO  [Thread-10-SendThread(bigdata-master:2181)] zookeeper.ClientCnxn:1096 : Client session timed out, have not heard from server in 30431ms for sessionid 0x15698315e3d001d, closing socket connection and attempting reconnect


Thanks,

Minghao Feng

________________________________
From: 明浩 冯 <[email protected]>
Sent: Thursday, August 18, 2016 1:09:23 PM
To: [email protected]
Subject: Kylin lost connection to zookeeper and keep on reconnect and fail in 
"Build Dimension Dictionary" step


Hi,

I'm sending this here for help too.
I'm new to Kylin. I ran into a problem when building a cube: the job gets stuck 
at the "Build Dimension Dictionary" step. Here is the log:

2016-08-18 10:42:49,471 INFO  [pool-8-thread-1] threadpool.DefaultScheduler:109 : Job Fetcher: 1 running, 1 actual running, 0 ready, 102 others
2016-08-18 10:42:49,619 INFO  [Thread-10-SendThread(bigdata-master:2181)] zookeeper.ClientCnxn:975 : Opening socket connection to server bigdata-master/192.168.0.10:2181. Will not attempt to authenticate using SASL (unknown error)
2016-08-18 10:42:49,620 INFO  [Thread-10-SendThread(bigdata-master:2181)] zookeeper.ClientCnxn:852 : Socket connection established to bigdata-master/192.168.0.10:2181, initiating session
2016-08-18 10:42:49,630 INFO  [Thread-10-SendThread(bigdata-master:2181)] zookeeper.ClientCnxn:1235 : Session establishment complete on server bigdata-master/192.168.0.10:2181, sessionid = 0x55697ff17be0018, negotiated timeout = 40000
2016-08-18 10:42:49,632 INFO  [Thread-10-EventThread] state.ConnectionStateManager:228 : State change: RECONNECTED
2016-08-18 10:43:11,761 INFO  [pool-9-thread-10-SendThread(bigdata-5:2181)] zookeeper.ClientCnxn:975 : Opening socket connection to server bigdata-5/192.168.0.48:2181. Will not attempt to authenticate using SASL (unknown error)
2016-08-18 10:43:11,767 INFO  [pool-9-thread-10-SendThread(bigdata-5:2181)] zookeeper.ClientCnxn:852 : Socket connection established to bigdata-5/192.168.0.48:2181, initiating session
2016-08-18 10:43:11,775 INFO  [pool-9-thread-10-SendThread(bigdata-5:2181)] zookeeper.ClientCnxn:1094 : Unable to reconnect to ZooKeeper service, session 0x2569b259b110000 has expired, closing socket connection
2016-08-18 10:43:11,840 WARN  [pool-9-thread-10-EventThread] client.ConnectionManager$HConnectionImplementation:2371 : This client just lost it's session with ZooKeeper, closing it. It will be recreated next time someone needs it
org.apache.zookeeper.KeeperException$SessionExpiredException: KeeperErrorCode = Session expired
        at org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.connectionEvent(ZooKeeperWatcher.java:700)
        at org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.process(ZooKeeperWatcher.java:611)
        at org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:522)
        at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:498)
2016-08-18 10:43:11,854 INFO  [pool-9-thread-10-EventThread] client.ConnectionManager$HConnectionImplementation:1710 : Closing zookeeper sessionid=0x2569b259b110000
2016-08-18 10:43:11,856 INFO  [pool-9-thread-10-EventThread] zookeeper.ClientCnxn:512 : EventThread shut down
2016-08-18 10:43:39,875 INFO  [localhost-startStop-1-SendThread(bigdata-2:2181)] zookeeper.ClientCnxn:1096 : Client session timed out, have not heard from server in 28107ms for sessionid 0x356980a3de8000c, closing socket connection and attempting reconnect
2016-08-18 10:43:39,875 INFO  [Thread-10-SendThread(bigdata-master:2181)] zookeeper.ClientCnxn:1096 : Client session timed out, have not heard from server in 28113ms for sessionid 0x55697ff17be0018, closing socket connection and attempting reconnect
2016-08-18 10:43:39,976 INFO  [Thread-10-EventThread] state.ConnectionStateManager:228 : State change: SUSPENDED


Then Kylin keeps trying to reconnect but fails:

2016-08-18 10:45:37,817 ERROR [Curator-Framework-0] curator.ConnectionState:200 : Connection timed out for connection string (bigdata-master:2181,bigdata-2:2181,bigdata-3:2181,bigdata-4:2181,bigdata-5:2181) and timeout (15000) / elapsed (27967)
org.apache.curator.CuratorConnectionLossException: KeeperErrorCode = ConnectionLoss
        at org.apache.curator.ConnectionState.checkTimeouts(ConnectionState.java:197)
        at org.apache.curator.ConnectionState.getZooKeeper(ConnectionState.java:87)
        at org.apache.curator.CuratorZookeeperClient.getZooKeeper(CuratorZookeeperClient.java:115)
        at org.apache.curator.framework.imps.CuratorFrameworkImpl.performBackgroundOperation(CuratorFrameworkImpl.java:806)
        at org.apache.curator.framework.imps.CuratorFrameworkImpl.backgroundOperationsLoop(CuratorFrameworkImpl.java:792)
        at org.apache.curator.framework.imps.CuratorFrameworkImpl.access$300(CuratorFrameworkImpl.java:62)
        at org.apache.curator.framework.imps.CuratorFrameworkImpl$4.call(CuratorFrameworkImpl.java:257)
        at java.util.concurrent.FutureTask.run(FutureTask.java:262)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:745)
2016-08-18 10:45:37,821 INFO  [Thread-10-SendThread(bigdata-5:2181)] zookeeper.ClientCnxn:975 : Opening socket connection to server bigdata-5/192.168.0.48:2181. Will not attempt to authenticate using SASL (unknown error)

Kylin also stopped after several attempts. When I restart Kylin, it keeps 
trying and failing.
I've checked ZooKeeper, and it seems to have had no problems during this 
period. Does anyone know what happened and how I can fix the problem?
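
One way to double-check the ZooKeeper side is to send each server in the quorum the `ruok` four-letter command, which a healthy ZooKeeper 3.4.x server answers with `imok`. A minimal Python sketch, assuming the host list from the Curator connection string in the log above; `parse_connect_string` and `ruok` are hypothetical helper names, not part of Kylin or ZooKeeper:

```python
import socket


def parse_connect_string(connect_string, default_port=2181):
    """Split 'host1:2181,host2:2181' into a list of (host, port) tuples."""
    servers = []
    for entry in connect_string.split(","):
        entry = entry.strip()
        if not entry:
            continue
        host, _, port = entry.partition(":")
        servers.append((host, int(port) if port else default_port))
    return servers


def ruok(host, port, timeout=5.0):
    """Send 'ruok' to one server; return True only if it replies 'imok'."""
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            sock.sendall(b"ruok")
            chunks = []
            while True:
                data = sock.recv(64)
                if not data:  # the server closes the connection after replying
                    break
                chunks.append(data)
            return b"".join(chunks) == b"imok"
    except OSError:
        # Covers refused connections, DNS failures, and timeouts.
        return False


# Example use against the quorum from the log (host names taken from the thread):
# quorum = "bigdata-master:2181,bigdata-2:2181,bigdata-3:2181,bigdata-4:2181,bigdata-5:2181"
# for host, port in parse_connect_string(quorum):
#     print(host, port, "imok" if ruok(host, port) else "no answer")
```

An `imok` reply only shows that the server process is up and reachable. It does not rule out long pauses on the client side (for example in the Kylin JVM), which could also produce session timeouts like the ones in the log.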

This is my environment:
Hadoop 2.7.2
Spark 1.6.2
HBase 1.2.2
ZooKeeper 3.4.6
Hive 2.1.0
Kylin 1.5.3

Thanks,
Minghao Feng

