[ https://issues.apache.org/jira/browse/KAFKA-3004?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sadek updated KAFKA-3004:
-------------------------
    Description: 
While doing load testing, we noticed that the controller fails over almost 
every hour, with the following entry in its log:

INFO [SessionExpirationListener on 4], ZK expired; shut down all controller 
components and try to re-elect 
(kafka.controller.KafkaController$SessionExpirationListener)

I also see an increase in minor GC collections around the same time.

2015-12-17T22:00:40.961+0000: 15693.112: [GC2015-12-17T22:00:46.404+0000: 
15698.554: [ParNew: 282865K->3922K(314560K), 0.0104700 secs] 
576345K->297570K(1013632K), 5.4531250 secs] [Times: user=0.05 sys=0.00, 
real=5.46 secs]
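
The [Times: ...] section of that GC entry can be checked mechanically. Below is a minimal Python sketch, assuming the entry has been joined onto a single line; `parse_gc_times` and `stalled_seconds` are hypothetical helper names, not part of Kafka or the JVM.

```python
import re

# The GC entry quoted above, joined onto one line.
GC_LINE = (
    "2015-12-17T22:00:40.961+0000: 15693.112: [GC2015-12-17T22:00:46.404+0000: "
    "15698.554: [ParNew: 282865K->3922K(314560K), 0.0104700 secs] "
    "576345K->297570K(1013632K), 5.4531250 secs] [Times: user=0.05 sys=0.00, "
    "real=5.46 secs]"
)

# Matches the trailing "[Times: user=... sys=..., real=... secs]" section.
TIMES_RE = re.compile(
    r"user=(?P<user>[\d.]+) sys=(?P<sys>[\d.]+), real=(?P<real>[\d.]+) secs"
)


def parse_gc_times(line):
    """Return (user, sys, real) in seconds, or None if the section is absent."""
    m = TIMES_RE.search(line)
    if m is None:
        return None
    return float(m["user"]), float(m["sys"]), float(m["real"])


def stalled_seconds(line):
    """Wall-clock time of the pause that is not accounted for by CPU time."""
    times = parse_gc_times(line)
    if times is None:
        return None
    user, sys_, real = times
    return real - (user + sys_)


if __name__ == "__main__":
    print(parse_gc_times(GC_LINE))
    print(stalled_seconds(GC_LINE))
```

For the entry above, this gives roughly 5.41 seconds of wall-clock time with no matching CPU time. One possible reading is that the pause was spent waiting (for example on swapping or disk I/O) rather than collecting, but that is an assumption the logs do not confirm.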

Here's a snippet of the broker log around that time:

[2015-12-17 22:00:36,090] INFO zookeeper state changed (SyncConnected) 
(org.I0Itec.zkclient.ZkClient)
15754934 [main-SendThread(kfk02.local:2182)] INFO 
org.apache.zookeeper.ClientCnxn - Client session timed out, have not heard from 
server in 12203ms for sessionid 0x151b10503e60002, closing socket connection 
and attempting reconnect
[2015-12-17 22:01:55,533] INFO zookeeper state changed (Disconnected) 
(org.I0Itec.zkclient.ZkClient)
15755399 [main-SendThread(kfk01.local:2182)] INFO 
org.apache.zookeeper.ClientCnxn - Opening socket connection to server 
kfk01.local/10.124.80.140:2182. Will not attempt to authenticate using SASL 
(unknown error)
15755400 [main-SendThread(kfk01.local:2182)] INFO 
org.apache.zookeeper.ClientCnxn - Socket connection established to 
kfk01.local/10.124.80.140:2182, initiating session
15755401 [main-SendThread(kfk01.local:2182)] INFO 
org.apache.zookeeper.ClientCnxn - Session establishment complete on server 
kfk01.local/10.124.80.140:2182, sessionid = 0x151b10503e60002, negotiated 
timeout = 12000
[2015-12-17 22:01:55,902] INFO zookeeper state changed (SyncConnected) 
(org.I0Itec.zkclient.ZkClient)
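
The two numbers that matter in that snippet are how long the client heard nothing from ZooKeeper and the negotiated session timeout. Below is a minimal Python sketch that pulls both out of the ClientCnxn lines, assuming each wrapped entry has been joined onto one line; `silence_vs_timeout` is a hypothetical helper name.

```python
import re

# The two relevant ClientCnxn entries from the snippet above, one per line.
CLIENT_LOG = (
    "15754934 [main-SendThread(kfk02.local:2182)] INFO "
    "org.apache.zookeeper.ClientCnxn - Client session timed out, have not heard from "
    "server in 12203ms for sessionid 0x151b10503e60002, closing socket connection "
    "and attempting reconnect\n"
    "15755401 [main-SendThread(kfk01.local:2182)] INFO "
    "org.apache.zookeeper.ClientCnxn - Session establishment complete on server "
    "kfk01.local/10.124.80.140:2182, sessionid = 0x151b10503e60002, negotiated "
    "timeout = 12000\n"
)

SILENCE_RE = re.compile(r"have not heard from server in (\d+)ms")
TIMEOUT_RE = re.compile(r"negotiated timeout = (\d+)")


def silence_vs_timeout(log_text):
    """Return (silence_ms, negotiated_timeout_ms), or None if either is missing."""
    silence = SILENCE_RE.search(log_text)
    timeout = TIMEOUT_RE.search(log_text)
    if silence is None or timeout is None:
        return None
    return int(silence.group(1)), int(timeout.group(1))


if __name__ == "__main__":
    print(silence_vs_timeout(CLIENT_LOG))
```

For the lines above, this returns (12203, 12000): the client heard nothing for slightly longer than the negotiated session timeout. On the broker, that timeout is requested through `zookeeper.session.timeout.ms`; whether raising it is appropriate here is an open question.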

Any idea what may be causing this?


Thanks!

  was:
While doing load testing, we noticed that the controller fails over almost 
every hour, with the following entry in its log:

INFO [SessionExpirationListener on 4], ZK expired; shut down all controller 
components and try to re-elect 
(kafka.controller.KafkaController$SessionExpirationListener)

I also see an increase in minor GC collections around the same time.

2015-12-17T15:57:38.516+0000: 8166.220: [GC2015-12-17T15:57:38.516+0000: 
8166.220: [ParNew: 283592K->4176K(314560K), 0.0081650 secs] 
603757K->324456K(1013632K), 5.7134120 secs] [Times: user=0.05 sys=0.00, 
real=5.71 secs]

Here's a snippet of the broker log around that time:

[2015-12-17 22:00:36,090] INFO zookeeper state changed (SyncConnected) 
(org.I0Itec.zkclient.ZkClient)
15754934 [main-SendThread(kfk02.local:2182)] INFO 
org.apache.zookeeper.ClientCnxn - Client session timed out, have not heard from 
server in 12203ms for sessionid 0x151b10503e60002, closing socket connection 
and attempting reconnect
[2015-12-17 22:01:55,533] INFO zookeeper state changed (Disconnected) 
(org.I0Itec.zkclient.ZkClient)
15755399 [main-SendThread(kfk01.local:2182)] INFO 
org.apache.zookeeper.ClientCnxn - Opening socket connection to server 
kfk01.local/10.124.80.140:2182. Will not attempt to authenticate using SASL 
(unknown error)
15755400 [main-SendThread(kfk01.local:2182)] INFO 
org.apache.zookeeper.ClientCnxn - Socket connection established to 
kfk01.local/10.124.80.140:2182, initiating session
15755401 [main-SendThread(kfk01.local:2182)] INFO 
org.apache.zookeeper.ClientCnxn - Session establishment complete on server 
kfk01.local/10.124.80.140:2182, sessionid = 0x151b10503e60002, negotiated 
timeout = 12000
[2015-12-17 22:01:55,902] INFO zookeeper state changed (SyncConnected) 
(org.I0Itec.zkclient.ZkClient)

Any idea what may be causing this?


Thanks!


> Controller failing over repeatedly
> ----------------------------------
>
>                 Key: KAFKA-3004
>                 URL: https://issues.apache.org/jira/browse/KAFKA-3004
>             Project: Kafka
>          Issue Type: Bug
>          Components: controller
>    Affects Versions: 0.8.2.0
>         Environment: Centos 6.5
> OpenJDK 1.7.0_79
> 6 Kafka nodes
> 3 ZK nodes (cluster mode)
>            Reporter: Sadek
>            Assignee: Neha Narkhede
>
> While doing load testing, we noticed that the controller fails over almost 
> every hour, with the following entry in its log:
> INFO [SessionExpirationListener on 4], ZK expired; shut down all controller 
> components and try to re-elect 
> (kafka.controller.KafkaController$SessionExpirationListener)
> I also see an increase in minor GC collections around the same time.
> 2015-12-17T22:00:40.961+0000: 15693.112: [GC2015-12-17T22:00:46.404+0000: 
> 15698.554: [ParNew: 282865K->3922K(314560K), 0.0104700 secs] 
> 576345K->297570K(1013632K), 5.4531250 secs] [Times: user=0.05 sys=0.00, 
> real=5.46 secs]
> Here's a snippet of the broker log around that time:
> [2015-12-17 22:00:36,090] INFO zookeeper state changed (SyncConnected) 
> (org.I0Itec.zkclient.ZkClient)
> 15754934 [main-SendThread(kfk02.local:2182)] INFO 
> org.apache.zookeeper.ClientCnxn - Client session timed out, have not heard 
> from server in 12203ms for sessionid 0x151b10503e60002, closing socket 
> connection and attempting reconnect
> [2015-12-17 22:01:55,533] INFO zookeeper state changed (Disconnected) 
> (org.I0Itec.zkclient.ZkClient)
> 15755399 [main-SendThread(kfk01.local:2182)] INFO 
> org.apache.zookeeper.ClientCnxn - Opening socket connection to server 
> kfk01.local/10.124.80.140:2182. Will not attempt to authenticate using SASL 
> (unknown error)
> 15755400 [main-SendThread(kfk01.local:2182)] INFO 
> org.apache.zookeeper.ClientCnxn - Socket connection established to 
> kfk01.local/10.124.80.140:2182, initiating session
> 15755401 [main-SendThread(kfk01.local:2182)] INFO 
> org.apache.zookeeper.ClientCnxn - Session establishment complete on server 
> kfk01.local/10.124.80.140:2182, sessionid = 0x151b10503e60002, negotiated 
> timeout = 12000
> [2015-12-17 22:01:55,902] INFO zookeeper state changed (SyncConnected) 
> (org.I0Itec.zkclient.ZkClient)
> Any idea what may be causing this?
> Thanks!



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
