[ https://issues.apache.org/jira/browse/KAFKA-5016?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15979482#comment-15979482 ]
Vahid Hashemian commented on KAFKA-5016:
----------------------------------------

[~domenico74] I think you are seeing the difference in behavior because of [KIP-62|https://cwiki.apache.org/confluence/display/KAFKA/KIP-62%3A+Allow+consumer+to+send+heartbeats+from+a+background+thread], which was implemented between the 0.10.0.0 and 0.10.1.0 releases. The config {{max.poll.interval.ms}} that you used in your sample code was introduced by this KIP, and has no effect when the code runs with an older client. It specifies the maximum time a consumer may take between consecutive {{poll}} calls; a consumer that exceeds it leaves the group.

In older versions this timeout was governed by {{session.timeout.ms}}, which defaulted to 30 seconds. In fact, if you run the code with a 0.10.0.0 client, the second consumer's {{read}} takes 30 seconds to finish (time out) and leave the group. Setting {{max.poll.interval.ms}} to max int is like making the group wait forever for subsequent {{poll}}s from consumers (hence the hang).

So to me, what you are seeing with the configuration you used is the expected behavior. Please advise if I'm missing something. Thanks.

> Consumer hang in poll method while rebalancing is in progress
> -------------------------------------------------------------
>
>                 Key: KAFKA-5016
>                 URL: https://issues.apache.org/jira/browse/KAFKA-5016
>             Project: Kafka
>          Issue Type: Bug
>          Components: core
>    Affects Versions: 0.10.1.0, 0.10.2.0
>            Reporter: Domenico Di Giulio
>            Assignee: Vahid Hashemian
>         Attachments: Kafka 0.10.2.0 Issue (TRACE) - Server + Client.txt, Kafka 0.10.2.0 Issue (TRACE).txt, KAFKA_5016.java
>
>
> After moving to Kafka 0.10.2.0, it looks like I'm experiencing a hang in the rebalancing code.
> This is a test case, not (yet) production code.
> It does the following with a single-partition topic and two consumers in the same group:
> 1) a topic with one partition is forced to be created (auto-created)
> 2) a producer is used to write 10 messages
> 3) the first consumer reads all the messages and commits
> 4) the second consumer attempts a poll() and hangs indefinitely
> The same issue can't be found with 0.10.0.0.
> See the attached logs at TRACE level. Look for "SERVER HANGS" to see where the hang is found: when this happens, the client keeps failing any heartbeat attempt, as the rebalancing is in progress, and the poll method hangs indefinitely.

--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
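
To illustrate the point made in the comment about {{max.poll.interval.ms}}, here is a minimal sketch of consumer settings that use a bounded poll interval instead of max int. The attached KAFKA_5016.java is not reproduced here, so the class name, the helper {{boundedPollConfig}}, the broker address, the group id and the 30000 ms value are all hypothetical placeholders; only the three config keys are real Kafka consumer settings. The sketch only builds the {{Properties}} object and does not create a consumer.

```java
import java.util.Properties;

public class ConsumerTimeoutSketch {

    // Hypothetical helper (not from the attached test case): builds consumer
    // settings with a bounded max.poll.interval.ms rather than Integer.MAX_VALUE.
    static Properties boundedPollConfig(String bootstrapServers,
                                        String groupId,
                                        int maxPollIntervalMs) {
        Properties props = new Properties();
        props.setProperty("bootstrap.servers", bootstrapServers);
        props.setProperty("group.id", groupId);
        // Honored by clients from 0.10.1.0 on (KIP-62); per the comment above,
        // older clients are not affected by this setting.
        props.setProperty("max.poll.interval.ms", Integer.toString(maxPollIntervalMs));
        return props;
    }

    public static void main(String[] args) {
        // Placeholder broker address and group id; 30000 ms is an arbitrary
        // example value, not a recommendation.
        Properties props = boundedPollConfig("localhost:9092", "kafka-5016-test", 30000);
        // These properties would then be passed to the KafkaConsumer constructor,
        // together with the deserializer settings the consumer requires.
        System.out.println("max.poll.interval.ms=" + props.getProperty("max.poll.interval.ms"));
    }
}
```

With a bounded value like this, a consumer that stops polling should be dropped from the group once the interval expires instead of holding up the rebalance, which is the behavior described in the comment.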