[ https://issues.apache.org/jira/browse/KAFKA-3834?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15350268#comment-15350268 ]
Peter Davis commented on KAFKA-3834:
------------------------------------

I believe we have seen this issue IRL when the new coordinator takes a long time to become available after an election. This can happen when log compaction has halted (for example, because the I/O buffer is too small): __consumer_offsets then grows ridiculously large. In one instance the coordinators were taking several minutes to come online before we realized what the problem was. Meanwhile, poll() would spin and log red-herring errors every 100ms.

This also occurs on commitSync(), which I believe does a poll() internally but also has a "while" loop of its own. Should improving the blocking behavior of commitSync() be a separate JIRA?

> Consumer should not block in poll on coordinator discovery
> ----------------------------------------------------------
>
>                 Key: KAFKA-3834
>                 URL: https://issues.apache.org/jira/browse/KAFKA-3834
>             Project: Kafka
>          Issue Type: Improvement
>          Components: consumer
>            Reporter: Jason Gustafson
>            Assignee: Jason Gustafson
>
> Currently we block indefinitely in poll() when discovering the coordinator
> for the group. Instead, we can return an empty record set when the passed
> timeout expires. The downside is that it may obscure the underlying problem
> (which is usually misconfiguration), but users typically have to look at the
> logs to figure out the problem anyway.
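
For illustration only, the behavior proposed in the issue description (give up on coordinator discovery once the caller's timeout expires and return an empty record set) might look roughly like the sketch below. This is not the actual consumer code: BoundedPollSketch, coordinatorKnown, fetch and retryBackoffMs are hypothetical names made up for this example, and records are modeled as plain strings.

```java
import java.util.Collections;
import java.util.List;
import java.util.concurrent.TimeUnit;
import java.util.function.BooleanSupplier;
import java.util.function.Supplier;

// Hypothetical stand-in for a consumer; NOT the real KafkaConsumer API.
final class BoundedPollSketch {

    /**
     * Waits for the coordinator for at most timeoutMs, retrying every
     * retryBackoffMs. Returns an empty list if the coordinator is still
     * unknown when the deadline passes; otherwise returns whatever the
     * (hypothetical) fetch step yields.
     */
    static List<String> poll(BooleanSupplier coordinatorKnown,
                             Supplier<List<String>> fetch,
                             long timeoutMs,
                             long retryBackoffMs) {
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMs);
        while (!coordinatorKnown.getAsBoolean()) {
            long remainingNanos = deadline - System.nanoTime();
            if (remainingNanos <= 0) {
                // Timeout expired: hand control back instead of blocking forever.
                return Collections.emptyList();
            }
            // Sleep for the backoff, but never past the deadline (minimum 1 ms).
            long remainingMs = Math.max(1, TimeUnit.NANOSECONDS.toMillis(remainingNanos));
            try {
                Thread.sleep(Math.min(retryBackoffMs, remainingMs));
            } catch (InterruptedException e) {
                // Preserve the interrupt flag and return early with no records.
                Thread.currentThread().interrupt();
                return Collections.emptyList();
            }
        }
        return fetch.get();
    }
}
```

With a coordinator that never becomes available, a call such as poll(() -> false, fetch, 50, 10) returns an empty list after roughly 50 ms rather than spinning indefinitely. The caller then decides whether to retry, log, or give up. As the description notes, the trade-off is that the underlying problem becomes less visible.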