[ 
https://issues.apache.org/jira/browse/KAFKA-7349?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mickael Maison resolved KAFKA-7349.
-----------------------------------
    Resolution: Won't Fix

We are now removing ZooKeeper support, so we are closing this issue.

> Long Disk Writes cause Zookeeper Disconnects
> --------------------------------------------
>
>                 Key: KAFKA-7349
>                 URL: https://issues.apache.org/jira/browse/KAFKA-7349
>             Project: Kafka
>          Issue Type: Bug
>          Components: zkclient
>    Affects Versions: 0.11.0.1
>            Reporter: Adam Kafka
>            Priority: Minor
>         Attachments: SpikeInWriteTime.png
>
>
> We run our Kafka cluster on a cloud service provider. As a consequence, we
> see large tail latencies on disk writes that are out of our control; some
> writes take on the order of seconds. We have noticed that these long write
> times are often correlated with subsequent Zookeeper disconnects from the
> brokers. It appears that during the long write, the Zookeeper heartbeat
> thread does not get scheduled CPU time, resulting in a long gap between
> heartbeats. After the write, the ZK thread is scheduled again, but it
> detects that it has not heard from Zookeeper in a while, so it drops its
> connection and then rejoins the cluster.
> Note that the gap reported in the log is not consistent with the configured
> timeout ({{zookeeper.session.timeout.ms}}, default 6 seconds). We have seen a
> range of reported values, including 5950ms (below the threshold), 12032ms
> (double the threshold), and 25999ms (far above the threshold).
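> To illustrate our reading of the behaviour, here is a minimal, self-contained
> sketch (not the actual ZooKeeper client code, just an assumption about how the
> check behaves): the client only measures the time since the last server
> response when its heartbeat thread actually gets to run, so a 12-second
> scheduling stall surfaces as a ~12s gap even though the configured timeout is
> only 6s.
> {code}
> // HeartbeatGapSketch.java -- illustrative only, not ZooKeeper's ClientCnxn.
> public class HeartbeatGapSketch {
>     private static final long SESSION_TIMEOUT_MS = 6_000; // default zookeeper.session.timeout.ms
>     private static final long PING_INTERVAL_MS = 2_000;   // matches the ~2s ping cadence in the logs below
>
>     public static void main(String[] args) throws InterruptedException {
>         long lastResponseMs = System.currentTimeMillis();
>         for (int tick = 0; tick < 8; tick++) {
>             // Simulate the heartbeat thread being starved of CPU on one tick,
>             // e.g. while a multi-second disk write monopolises the machine.
>             Thread.sleep(tick == 5 ? 12_000 : PING_INTERVAL_MS);
>
>             long idleMs = System.currentTimeMillis() - lastResponseMs;
>             if (idleMs > SESSION_TIMEOUT_MS) {
>                 // The check only runs once the thread is scheduled again, so the
>                 // reported value (~12s here) can be well above the 6s timeout.
>                 System.out.printf("Client session timed out, have not heard from server in %dms%n", idleMs);
>                 return;
>             }
>             lastResponseMs = System.currentTimeMillis(); // the server answered the ping
>             System.out.println("Got ping response after 1ms");
>         }
>     }
> }
> {code}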
> We noticed that during a degradation of our cloud provider's storage service,
> these Zookeeper disconnects increased drastically in frequency.
> We are hoping there is a way to decouple these components. Do you agree with 
> our diagnosis that the ZK disconnects are occurring due to thread contention 
> caused by long disk writes? Perhaps the ZK thread could be scheduled at a 
> higher priority? Do you have any suggestions for how to avoid the ZK 
> disconnects?
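> As a purely illustrative sketch (an assumption about a possible approach, not
> actual Kafka or ZooKeeper code), "scheduling the ZK thread at a higher
> priority" could mean running the heartbeat work on a dedicated thread with an
> elevated priority; JVM thread priorities are only a hint to the OS scheduler,
> so this alone might not help if the whole process is blocked on I/O:
> {code}
> // HighPriorityHeartbeatSketch.java -- illustrative only.
> public class HighPriorityHeartbeatSketch {
>     public static void main(String[] args) throws InterruptedException {
>         Thread heartbeat = new Thread(() -> {
>             while (!Thread.currentThread().isInterrupted()) {
>                 try {
>                     System.out.println("ping"); // stand-in for sending a ZK heartbeat
>                     Thread.sleep(2_000);        // roughly the ping cadence seen in our logs
>                 } catch (InterruptedException e) {
>                     Thread.currentThread().interrupt();
>                 }
>             }
>         }, "zk-heartbeat-sketch");
>         heartbeat.setPriority(Thread.MAX_PRIORITY); // ask the scheduler to favour this thread
>         heartbeat.setDaemon(true);
>         heartbeat.start();
>
>         Thread.sleep(7_000); // let it run a few times, then exit
>     }
> }
> {code}
> In the meantime, raising {{zookeeper.session.timeout.ms}} above its 6-second
> default would tolerate longer stalls, at the cost of slower detection of
> genuinely failed brokers.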
> Here is an example of one of these events:
> Logs on the Broker:
> {code}
> [2018-08-25 04:10:19,695] DEBUG Got ping response for sessionid: 
> 0x36202ab4337002c after 1ms (org.apache.zookeeper.ClientCnxn)
> [2018-08-25 04:10:21,697] DEBUG Got ping response for sessionid: 
> 0x36202ab4337002c after 1ms (org.apache.zookeeper.ClientCnxn)
> [2018-08-25 04:10:23,700] DEBUG Got ping response for sessionid: 
> 0x36202ab4337002c after 1ms (org.apache.zookeeper.ClientCnxn)
> [2018-08-25 04:10:25,701] DEBUG Got ping response for sessionid: 
> 0x36202ab4337002c after 1ms (org.apache.zookeeper.ClientCnxn)
> [2018-08-25 04:10:27,702] DEBUG Got ping response for sessionid: 
> 0x36202ab4337002c after 1ms (org.apache.zookeeper.ClientCnxn)
> [2018-08-25 04:10:29,704] DEBUG Got ping response for sessionid: 
> 0x36202ab4337002c after 1ms (org.apache.zookeeper.ClientCnxn)
> [2018-08-25 04:10:31,707] DEBUG Got ping response for sessionid: 
> 0x36202ab4337002c after 1ms (org.apache.zookeeper.ClientCnxn)
> [2018-08-25 04:10:33,709] DEBUG Got ping response for sessionid: 
> 0x36202ab4337002c after 1ms (org.apache.zookeeper.ClientCnxn)
> [2018-08-25 04:10:35,712] DEBUG Got ping response for sessionid: 
> 0x36202ab4337002c after 1ms (org.apache.zookeeper.ClientCnxn)
> [2018-08-25 04:10:37,714] DEBUG Got ping response for sessionid: 
> 0x36202ab4337002c after 1ms (org.apache.zookeeper.ClientCnxn)
> [2018-08-25 04:10:39,716] DEBUG Got ping response for sessionid: 
> 0x36202ab4337002c after 1ms (org.apache.zookeeper.ClientCnxn)
> [2018-08-25 04:10:41,719] DEBUG Got ping response for sessionid: 
> 0x36202ab4337002c after 1ms (org.apache.zookeeper.ClientCnxn)
> ...
> [2018-08-25 04:10:53,752] WARN Client session timed out, have not heard from 
> server in 12032ms for sessionid 0x36202ab4337002c 
> (org.apache.zookeeper.ClientCnxn)
> [2018-08-25 04:10:53,754] INFO Client session timed out, have not heard from 
> server in 12032ms for sessionid 0x36202ab4337002c, closing socket connection 
> and attempting reconnect (org.apache.zookeeper.ClientCnxn)
> [2018-08-25 04:10:53,920] INFO zookeeper state changed (Disconnected) 
> (org.I0Itec.zkclient.ZkClient)
> [2018-08-25 04:10:53,920] INFO Waiting for keeper state SyncConnected 
> (org.I0Itec.zkclient.ZkClient)
> ...
> {code}
> GC logs from the same period (demonstrating this is not just a long GC pause):
> {code}
> 2018-08-25T04:10:36.434+0000: 35150.779: [GC (Allocation Failure)  
> 3074119K->2529089K(6223360K), 0.0137342 secs]
> 2018-08-25T04:10:37.367+0000: 35151.713: [GC (Allocation Failure)  
> 3074433K->2528524K(6223360K), 0.0127938 secs]
> 2018-08-25T04:10:38.274+0000: 35152.620: [GC (Allocation Failure)  
> 3073868K->2528357K(6223360K), 0.0131040 secs]
> 2018-08-25T04:10:39.220+0000: 35153.566: [GC (Allocation Failure)  
> 3073701K->2528885K(6223360K), 0.0133247 secs]
> 2018-08-25T04:10:40.175+0000: 35154.520: [GC (Allocation Failure)  
> 3074229K->2528639K(6223360K), 0.0127870 secs]
> 2018-08-25T04:10:41.084+0000: 35155.429: [GC (Allocation Failure)  
> 3073983K->2530769K(6223360K), 0.0135058 secs]
> 2018-08-25T04:10:42.012+0000: 35156.358: [GC (Allocation Failure)  
> 3076113K->2531772K(6223360K), 0.0165919 secs]
> 2018-08-25T04:10:53.737+0000: 35168.083: [GC (Allocation Failure)  
> 3077116K->2529630K(6223360K), 0.0135440 secs]
> 2018-08-25T04:10:58.515+0000: 35172.860: [GC (Allocation Failure)  
> 3074974K->2531144K(6223360K), 0.0164601 secs]
> 2018-08-25T04:11:03.149+0000: 35177.495: [GC (Allocation Failure)  
> 3076488K->2531751K(6223360K), 0.0146978 secs]
> {code}
> The attached screenshot shows a spike in write time to our data volume at the
> same second. Note that this is an average value, so it does not represent the
> max write time, only a sample of write times.
>  !SpikeInWriteTime.png! 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
