[ https://issues.apache.org/jira/browse/KAFKA-4478?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Sinóros-Szabó Péter updated KAFKA-4478:
---------------------------------------
    Attachment: kafka-1.state-change.log
                kafka-1.server.log.2016-12-01-21
                kafka-1.jstack
                broker-0.server.log.2016-12-01-21
                broker-0.controller.log.2016-12-01-21

Uploading the relevant stack trace and log files. broker-2 shows logs similar to those of broker-0.

> Deadlock between heartbeat executor, group metadata manager and request
> handler
> -------------------------------------------------------------------------------
>
>                 Key: KAFKA-4478
>                 URL: https://issues.apache.org/jira/browse/KAFKA-4478
>             Project: Kafka
>          Issue Type: Bug
>    Affects Versions: 0.10.1.0
>            Reporter: Sinóros-Szabó Péter
>              Labels: reliability
>         Attachments: broker-0.controller.log.2016-12-01-21,
> broker-0.server.log.2016-12-01-21, kafka-1.jstack,
> kafka-1.server.log.2016-12-01-21, kafka-1.state-change.log
>
>
> We are running a 0.10.1.0 cluster with 3 brokers with ids 0, 1 and 2.
> At about 2016-12-01 21:29 something happened with broker 1, and since then I
> see {{java.io.IOException: Connection to 1 was disconnected before the
> response was read}} errors in the logs of brokers 0 and 2. Clients were unable
> to produce to broker 1's partitions, and the JMX counters indicate
> under-replicated partitions.
> I took a stack trace on broker-1 and I see that there is a deadlock between
> the JVM threads:
> Found one Java-level deadlock:
> =============================
> "executor-Heartbeat":
>   waiting to lock monitor 0x00007ffa24029df8 (object 0x00000000cc52fe70, a
> kafka.coordinator.GroupMetadata),
>   which is held by "group-metadata-manager-0"
> "group-metadata-manager-0":
>   waiting to lock monitor 0x00007ff9900a83a8 (object 0x00000000ca8b0820, a
> java.util.LinkedList),
>   which is held by "kafka-request-handler-7"
> "kafka-request-handler-7":
>   waiting to lock monitor 0x00007ffa24029df8 (object 0x00000000cc52fe70, a
> kafka.coordinator.GroupMetadata),
>   which is held by "group-metadata-manager-0"

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
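The same kind of monitor deadlock that jstack reports above can also be checked from inside a JVM. The following is only a minimal sketch and is not taken from the Kafka code base. It relies on the standard java.lang.management.ThreadMXBean.findDeadlockedThreads() call. The class name DeadlockProbe, the helper lockInOrder, and the thread names demo-a and demo-b are hypothetical, and two plain Object monitors stand in for the GroupMetadata and LinkedList locks named in the dump.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CountDownLatch;

public class DeadlockProbe {

    // One line per deadlocked thread; an empty list when the JVM reports none.
    public static List<String> describeDeadlocks() {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        long[] ids = mx.findDeadlockedThreads(); // null when no deadlock is found
        List<String> lines = new ArrayList<>();
        if (ids == null) {
            return lines;
        }
        for (ThreadInfo info : mx.getThreadInfo(ids)) {
            if (info == null) {
                continue; // the thread ended between the two calls
            }
            lines.add("\"" + info.getThreadName() + "\" waiting for " + info.getLockName()
                    + " held by \"" + info.getLockOwnerName() + "\"");
        }
        return lines;
    }

    // Hypothetical helper: starts a daemon thread that locks `first`, then `second`.
    public static Thread lockInOrder(String name, Object first, Object second,
                                     CountDownLatch bothHoldFirst) {
        Thread t = new Thread(() -> {
            synchronized (first) {
                bothHoldFirst.countDown();
                try {
                    bothHoldFirst.await(); // wait until the other thread holds its first lock
                } catch (InterruptedException e) {
                    return;
                }
                synchronized (second) {
                    // not reached once both threads hold their first lock
                }
            }
        }, name);
        t.setDaemon(true); // daemon threads do not keep the JVM alive
        t.start();
        return t;
    }

    public static void main(String[] args) throws InterruptedException {
        // Stand-ins for the two monitors in the dump (assumption, not Kafka objects).
        Object groupLock = new Object();
        Object queueLock = new Object();
        CountDownLatch latch = new CountDownLatch(2);
        lockInOrder("demo-a", groupLock, queueLock, latch);
        lockInOrder("demo-b", queueLock, groupLock, latch); // opposite order
        List<String> found = describeDeadlocks();
        for (int i = 0; i < 100 && found.size() < 2; i++) {
            Thread.sleep(50); // give both threads time to block
            found = describeDeadlocks();
        }
        found.forEach(System.out::println);
    }
}
```

The demo deadlocks because the two threads take the same pair of monitors in opposite order. Whether that is the exact mechanism behind the cycle in the dump above would have to be confirmed from the attached kafka-1.jstack file.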