Ryan Leslie created KAFKA-6479:
----------------------------------

             Summary: Broker file descriptor leak after consumer request timeout
                 Key: KAFKA-6479
                 URL: https://issues.apache.org/jira/browse/KAFKA-6479
             Project: Kafka
          Issue Type: Bug
          Components: controller
    Affects Versions: 1.0.0
            Reporter: Ryan Leslie
When a consumer request times out, i.e. takes longer than request.timeout.ms, and the client disconnects from the coordinator, the coordinator may leak file descriptors. The following code reproduces the behavior:

{code:java}
Properties config = new Properties();
config.put("bootstrap.servers", BROKERS);
config.put("group.id", "leak-test");
config.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
config.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
config.put("max.poll.interval.ms", Integer.MAX_VALUE);
config.put("request.timeout.ms", 12000);

KafkaConsumer<String, String> consumer1 = new KafkaConsumer<>(config);
KafkaConsumer<String, String> consumer2 = new KafkaConsumer<>(config);

List<String> topics = Collections.singletonList("leak-test");
consumer1.subscribe(topics);
consumer2.subscribe(topics);

consumer1.poll(100);
consumer2.poll(100);
{code}

When the above executes, consumer 2 will attempt to rebalance indefinitely, blocked by the inactive consumer 1, logging a _Marking the coordinator dead_ message every 12 seconds after giving up on the JOIN_GROUP request and disconnecting. Unless the consumer exits or times out, each such disconnect leaves a socket in CLOSE_WAIT on the coordinator, and the broker will eventually run out of file descriptors and crash.

Aside from faulty code like the example above, or an intentional DoS, any client bug that causes a consumer to block, e.g. KAFKA-6397, could also result in this leak.
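For contrast, here is a minimal sketch of the usual poll-loop consumer pattern that avoids the situation above: each consumer keeps polling on its own thread and is closed cleanly on shutdown, so it neither stalls a rebalance nor abandons its coordinator connection. The class name, topic, and shutdown wiring are illustrative only, not part of this report.

{code:java}
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.errors.WakeupException;

import java.util.Collections;
import java.util.Properties;
import java.util.concurrent.atomic.AtomicBoolean;

public class PollingConsumer implements Runnable {
    private final KafkaConsumer<String, String> consumer;
    private final AtomicBoolean closed = new AtomicBoolean(false);

    PollingConsumer(Properties config) {
        this.consumer = new KafkaConsumer<>(config);
    }

    @Override
    public void run() {
        try {
            consumer.subscribe(Collections.singletonList("leak-test"));
            while (!closed.get()) {
                // Polling continuously keeps this member responsive to group
                // coordination, so a rebalance started by another consumer is
                // not blocked waiting on an idle member.
                ConsumerRecords<String, String> records = consumer.poll(100);
                // ... process records ...
            }
        } catch (WakeupException e) {
            if (!closed.get()) throw e; // only expected during shutdown
        } finally {
            // close() leaves the group and tears the connection down cleanly,
            // rather than leaving the coordinator holding a dead socket.
            consumer.close();
        }
    }

    public void shutdown() {
        closed.set(true);
        consumer.wakeup(); // interrupts a poll() that is currently blocked
    }
}
{code}

Run both consumers this way and the second join completes instead of retrying every request.timeout.ms, so no CLOSE_WAIT sockets accumulate on the broker.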
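To watch the symptom from the broker side, one option is to poll the broker JVM's OperatingSystem MBean over JMX. The sketch below is an assumption-laden helper, not part of the reproduction: it presumes the broker was started with JMX enabled (e.g. JMX_PORT=9999), and the host/port are placeholders. It just prints the open file descriptor count, which climbs while the leak is active.

{code:java}
import javax.management.MBeanServerConnection;
import javax.management.ObjectName;
import javax.management.remote.JMXConnector;
import javax.management.remote.JMXConnectorFactory;
import javax.management.remote.JMXServiceURL;

public class BrokerFdWatcher {
    public static void main(String[] args) throws Exception {
        // Placeholder address; assumes the broker exposes JMX on this host/port.
        JMXServiceURL url =
            new JMXServiceURL("service:jmx:rmi:///jndi/rmi://broker-host:9999/jmxrmi");
        JMXConnector connector = JMXConnectorFactory.connect(url);
        try {
            MBeanServerConnection mbsc = connector.getMBeanServerConnection();
            ObjectName os = new ObjectName("java.lang:type=OperatingSystem");
            while (true) {
                // Standard JVM attributes on Unix platforms.
                Long open = (Long) mbsc.getAttribute(os, "OpenFileDescriptorCount");
                Long max = (Long) mbsc.getAttribute(os, "MaxFileDescriptorCount");
                System.out.printf("open fds: %d / %d%n", open, max);
                Thread.sleep(12_000); // roughly one rebalance/disconnect cycle above
            }
        } finally {
            connector.close();
        }
    }
}
{code}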