Sean Humbarger created KAFKA-8103:
-------------------------------------

             Summary: Kafka SIGSEGV on kafka-network-thread
                 Key: KAFKA-8103
                 URL: https://issues.apache.org/jira/browse/KAFKA-8103
             Project: Kafka
          Issue Type: Bug
    Affects Versions: 1.1.1
         Environment: OS
{code}
Amazon Linux
{code}
Kernel
{code}
4.14.97-74.72.amzn1.x86_64 #1 SMP Tue Feb 5 20:59:30 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux
{code}
Java
{code}
openjdk version "1.8.0_191"
OpenJDK Runtime Environment (build 1.8.0_191-b12)
OpenJDK 64-Bit Server VM (build 25.191-b12, mixed mode)
{code}
AWS Instance Type
{code}
c5.4xlarge
{code}
            Reporter: Sean Humbarger
         Attachments: hs_err_pid4345.log

We have a 4-node cluster (6 topics, 6 consumer groups) that processes 65,000 messages per second, and we are seeing SIGSEGV crashes at least once a day (see attachment). Each broker has six disks attached to it to hold the Kafka logs. When a crash occurs, we simply restart Kafka and everything seems fine.

We don't see anything out of the ordinary in /var/log/messages or dmesg when the crashes occur. So far, we have been unable to predict when during the day a crash will occur or which node it will occur on.

The problematic frame is as follows:
{code}
# Problematic frame:
# J 8628 C2 org.apache.kafka.common.metrics.stats.Max.update(Lorg/apache/kafka/common/metrics/stats/SampledStat$Sample;Lorg/apache/kafka/common/metrics/MetricConfig;DJ)V (13 bytes) @ 0x00007ff779f9fca0 [0x00007ff779f9fc80+0x20]
{code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)