[ 
https://issues.apache.org/jira/browse/KAFKA-5016?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15979482#comment-15979482
 ] 

Vahid Hashemian commented on KAFKA-5016:
----------------------------------------

[~domenico74] I think you are seeing the difference in behavior because of 
[KIP-62|https://cwiki.apache.org/confluence/display/KAFKA/KIP-62%3A+Allow+consumer+to+send+heartbeats+from+a+background+thread],
 which was implemented between the 0.10.0.0 and 0.10.1.0 releases. The 
config {{max.poll.interval.ms}} that you used in your sample code was 
introduced by this KIP and has no effect when the code runs with an older 
client. It specifies the maximum time the group waits for a consumer's next 
{{poll}}; if that time is exceeded, the consumer leaves the group. In older 
versions, {{session.timeout.ms}} governed this timeout and defaulted to 
30 seconds. In fact, if you run the code with a 0.10.0.0 client, the second 
consumer's {{read}} takes 30 seconds to finish (time out) and leave the group. 
Setting {{max.poll.interval.ms}} to max int makes the group wait effectively 
forever for subsequent {{poll}} calls from consumers (hence the hang). So, to 
me, what you are seeing with the configuration you used is the expected 
behavior. Please advise if I'm missing something. Thanks.
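
For illustration, here is a minimal sketch of a consumer configuration that bounds {{max.poll.interval.ms}} instead of setting it to max int. It only builds a {{java.util.Properties}} object and does not touch the Kafka client. The class and method names, the broker address, the group id, and the 30000 ms value (chosen to mirror the 30-second timeout of the older client) are assumptions made for this example, not values taken from the attached KAFKA_5016.java.

```java
import java.util.Properties;

public class BoundedPollConfig {

    // Hypothetical helper: builds consumer properties with a finite
    // max.poll.interval.ms, so a consumer that stops polling is removed
    // from the group after that interval instead of stalling a rebalance.
    static Properties consumerProps(String bootstrapServers, String groupId, int maxPollIntervalMs) {
        if (maxPollIntervalMs <= 0) {
            throw new IllegalArgumentException("maxPollIntervalMs must be positive");
        }
        Properties props = new Properties();
        props.setProperty("bootstrap.servers", bootstrapServers);
        props.setProperty("group.id", groupId);
        // Config introduced by KIP-62; ignored by a 0.10.0.0 client.
        props.setProperty("max.poll.interval.ms", Integer.toString(maxPollIntervalMs));
        return props;
    }

    public static void main(String[] args) {
        // Assumed values, for illustration only.
        Properties props = consumerProps("localhost:9092", "kafka-5016-test", 30_000);
        System.out.println("max.poll.interval.ms=" + props.getProperty("max.poll.interval.ms"));
    }
}
```

A real consumer would also need deserializer settings before these properties could be passed to a {{KafkaConsumer}}. With a bounded value, the second consumer's {{poll}} should return once the first consumer's interval expires, rather than waiting indefinitely.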

> Consumer hang in poll method while rebalancing is in progress
> -------------------------------------------------------------
>
>                 Key: KAFKA-5016
>                 URL: https://issues.apache.org/jira/browse/KAFKA-5016
>             Project: Kafka
>          Issue Type: Bug
>          Components: core
>    Affects Versions: 0.10.1.0, 0.10.2.0
>            Reporter: Domenico Di Giulio
>            Assignee: Vahid Hashemian
>         Attachments: Kafka 0.10.2.0 Issue (TRACE) - Server + Client.txt, 
> Kafka 0.10.2.0 Issue (TRACE).txt, KAFKA_5016.java
>
>
> After moving to Kafka 0.10.2.0, it looks like I'm experiencing a hang in the 
> rebalancing code. 
> This is a test case, not (yet) production code. It does the following with 
> a single-partition topic and two consumers in the same group:
> 1) a topic with one partition is forced to be created (auto-created)
> 2) a producer is used to write 10 messages
> 3) the first consumer reads all the messages and commits
> 4) the second consumer attempts a poll() and hangs indefinitely
> The same issue can't be found with 0.10.0.0.
> See the attached logs at TRACE level. Look for "SERVER HANGS" to see where 
> the hang is found: when this happens, the client keeps failing any heartbeat 
> attempt, as the rebalancing is in progress, and the poll method hangs 
> indefinitely.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
