Java / OS info:
----------
java.specification.version = 1.8
java.vendor = Oracle Corporation
java.version = 1.8.0_45
Oracle Linux Server release 6.7
kernel version 2.6.32-573.18.1.el6.x86_64

Redacted LSOF
---------------------

~46K Close Waits
------------------
java    4692 kafka 2618u  IPv6          264581081       0t0       TCP XX-XXXX-kafka01:XmlIpcRegSvc->XX-XXXX-host1:33089 (CLOSE_WAIT)
java    4692 kafka 2619u  IPv6          264581082       0t0       TCP XX-XXXX-kafka01:XmlIpcRegSvc->XX-XXXX-host2:37371 (CLOSE_WAIT)
java    4692 kafka 2621u  IPv6          264600187       0t0       TCP XX-XXXX-kafka01:XmlIpcRegSvc->XX-XXXX-host3:40788 (CLOSE_WAIT)


475 Established connections
----------------------------
java    4692 kafka *427u  IPv6          282382725       0t0       TCP XX-XXXX-kafka01:54099->XX-XXXX-host1:eforward (ESTABLISHED)
java    4692 kafka *639u  IPv6          282426735       0t0       TCP XX-XXXX-kafka01:36157->XX-XXXX-kafka01:59964 (ESTABLISHED)
java    4692 kafka *860u  IPv6          282480072       0t0       TCP XX-XXXX-kafka01:XmlIpcRegSvc->XX-XXXX-host2:50547 (ESTABLISHED)
java    4692 kafka *507u  IPv6          282481853       0t0       TCP XX-XXXX-kafka01:XmlIpcRegSvc->XX-XXXX-host3:45096 (ESTABLISHED)

~3K
----------------------------
java    4692 kafka 2367u   REG              253,3 104857335 141033710 /XXX/kafka/LOG/__consumer_offsets-10/00000000000035177234.log

~1.5K
----------------------------
java    4692 kafka  mem    REG              253,3  10485760 141297356 /XXX/kafka/LOG/TOPIC-1-9/00000000000000028243.index

~1.5K
----------------------------
java    4692 kafka  818u   REG              253,3   2548089 141297556 /XXX/kafka/LOG/TOPIC-1-2-76/00000000000000146894.log
java    4692 kafka  819u   REG              253,3         0 141165545 /XXX/kafka/LOG/TOPIC-2-2-11/00000000000000000000.log
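
In case it helps anyone reproduce per-state counts like the ones above, here
is a rough Python sketch that tallies TCP states from saved lsof output. It is
only an illustration, not the exact method used to get the numbers in this
mail. The function name count_tcp_states and the file name lsof.txt are made
up for the example, and it assumes the state appears in parentheses at the end
of a line, as in the excerpts above.

```python
import re
from collections import Counter

# Matches a trailing TCP state such as "(CLOSE_WAIT)" or "(ESTABLISHED)".
# Only upper-case tokens are accepted, so a trailing "(deleted)" on a
# regular-file entry is not counted as a state.
_STATE_RE = re.compile(r"\(([A-Z_0-9]+)\)\s*$")


def count_tcp_states(lines):
    """Return a Counter mapping each TCP state to its number of occurrences.

    `lines` is any iterable of strings, for example an open file containing
    saved lsof output. Because only the end of each line is inspected, this
    also works when a mail client has wrapped every entry onto two lines.
    """
    counts = Counter()
    for line in lines:
        match = _STATE_RE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


# Example usage (lsof.txt is a hypothetical dump, e.g. from `lsof -p <pid>`):
#
#     with open("lsof.txt") as fh:
#         for state, n in count_tcp_states(fh).most_common():
#             print(n, state)
```

Running this periodically against fresh lsof dumps would show whether the
CLOSE_WAIT count grows steadily or in bursts.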



On Fri, Aug 26, 2016 at 6:37 AM, Jaikiran Pai <jai.forums2...@gmail.com>
wrote:

> Which Java vendor and version are you using at runtime? Also, what OS is
> this? Can you get the lsof output (on Linux) and paste it somewhere
> (like a gist) to show us which descriptors are open, etc.?
>
> -Jaikiran
>
>
> On Friday 26 August 2016 02:49 AM, Bharath Srinivasan wrote:
>
>> Hello:
>>
>> We are running a data pipeline application stack using Kafka 0.8.2.2 in
>> production. We frequently see intermittent buildups of CLOSE_WAIT
>> connections on our Kafka brokers, and they fill up the file handles pretty
>> quickly. By the time the open file count reaches around 40K, the node
>> becomes unresponsive and we see huge GC pauses. The only way out has been
>> to restart the node. When the nodes are working fine, the open file count
>> per node stays around 6K during peak load and 3K on average.
>>
>> Configurations:
>> - 5 broker cluster (Single node spec: 24 core processors, 250 GB RAM,
>> 256 GB SSD)
>> - 20 topics and 1100 partitions across all topics
>> - Replication factor of 3
>> - Java based KafkaProducer and high level consumers
>> (ZookeeperConsumerConnector)
>> - GC params { -Xmx32G -Xms4G -server -XX:MetaspaceSize=96m -XX:+UseG1GC
>> -XX:MaxGCPauseMillis=20 -XX:InitiatingHeapOccupancyPercent=35
>> -XX:G1HeapRegionSize=16M -XX:MinMetaspaceFreeRatio=50
>> -XX:MaxMetaspaceFreeRatio=80 }
>>
>> Any pointers here? Appreciate your help.
>>
>> Thanks,
>> Bharath
>>
>>
>
