[ https://issues.apache.org/jira/browse/KAFKA-3834?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15350268#comment-15350268 ]
Peter Davis commented on KAFKA-3834:
------------------------------------
I believe we have seen this issue IRL when the new coordinator takes a long
time to become available after an election. This can happen when log compaction
has halted (for example, because of a too-small I/O buffer): __consumer_offsets
then grows ridiculously large. In one instance the coordinators were taking
several minutes to come online before we realized what the problem was.
Meanwhile, poll() would spin and log red-herring errors every 100ms.
This also occurs on commitSync(), which I believe does a poll() internally but
also has a "while" loop of its own. Should improving the blocking behavior of
commitSync() be a separate JIRA?
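
To make the requested behavior concrete, here is a minimal sketch of a
deadline-bounded wait. It is not the real consumer code: the class
BoundedPollSketch, its methods, the BooleanSupplier standing in for "is the
coordinator known?", and the 100ms backoff constant are all hypothetical names
chosen for illustration. It only shows the idea of giving up and returning an
empty result once the caller's timeout expires, instead of looping forever.

```java
import java.util.Collections;
import java.util.List;
import java.util.function.BooleanSupplier;

// Hypothetical stand-in for the consumer; NOT the real KafkaConsumer API.
class BoundedPollSketch {

    // Assumed retry backoff, mirroring the ~100ms retry interval seen in the logs.
    static final long RETRY_BACKOFF_MS = 100;

    /**
     * Waits until coordinatorKnown reports true or timeoutMs elapses.
     * Returns true if the coordinator became available in time, false otherwise.
     */
    static boolean awaitCoordinator(BooleanSupplier coordinatorKnown, long timeoutMs) {
        long deadline = System.nanoTime() + timeoutMs * 1_000_000L;
        while (!coordinatorKnown.getAsBoolean()) {
            long remainingMs = (deadline - System.nanoTime()) / 1_000_000L;
            if (remainingMs <= 0) {
                return false; // deadline expired: stop retrying
            }
            try {
                // Never sleep past the caller's deadline.
                Thread.sleep(Math.min(RETRY_BACKOFF_MS, remainingMs));
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt flag
                return false;
            }
        }
        return true;
    }

    /**
     * Returns the pending records if the coordinator is found within timeoutMs,
     * otherwise an empty list (the "empty record set" from the issue description).
     */
    static List<String> poll(BooleanSupplier coordinatorKnown, List<String> pending, long timeoutMs) {
        if (!awaitCoordinator(coordinatorKnown, timeoutMs)) {
            return Collections.emptyList();
        }
        return pending;
    }
}
```

With a supplier that always returns false, BoundedPollSketch.poll(...) comes
back empty after roughly timeoutMs. The same deadline check could in principle
bound the separate loop in commitSync(), though whether an empty return or an
exception is the right outcome there is a design question this sketch does not
settle.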
> Consumer should not block in poll on coordinator discovery
> ----------------------------------------------------------
>
> Key: KAFKA-3834
> URL: https://issues.apache.org/jira/browse/KAFKA-3834
> Project: Kafka
> Issue Type: Improvement
> Components: consumer
> Reporter: Jason Gustafson
> Assignee: Jason Gustafson
>
> Currently we block indefinitely in poll() when discovering the coordinator
> for the group. Instead, we can return an empty record set when the passed
> timeout expires. The downside is that it may obscure the underlying problem
> (which is usually misconfiguration), but users typically have to look at the
> logs to figure out the problem anyway.