[ https://issues.apache.org/jira/browse/KAFKA-2019?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14360892#comment-14360892 ]
Joseph Holsten commented on KAFKA-2019:
---------------------------------------
Exactly!
Essentially, if you've got 2 consumers each running 2 threads, and a topic with
6 partitions, you'll end up with:
- consumer-1: 4 partitions
- consumer-2: 2 partitions
This kind of imbalance is expected for tiny nodes. But if you've got 2 beefy
consumers each running 16 threads, and a topic with 48 partitions, you'll end
up with:
- consumer-1: 32 partitions
- consumer-2: 16 partitions
Same 2:1 skew, except this patch should be able to remedy the more extreme
cases.
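
For anyone who wants to check the arithmetic above, here is a rough Python
sketch. It is not the actual RoundRobinAssignor code: the function name
`assign` and its parameters are made up for illustration, and it assumes the
assignor simply sorts the "consumer-threadId" strings and deals partitions
out to them in turn.

```python
from collections import Counter
from itertools import cycle


def assign(consumers, threads_per_consumer, partitions):
    """Count partitions per consumer under a simplified round-robin.

    Hypothetical model: thread ids are rendered as "<consumer>-<threadId>",
    sorted as strings, and partitions are dealt out to them in that order.
    """
    # Pair each rendered thread id with its owning consumer, then sort by
    # the rendered string. All threads of one consumer end up adjacent.
    thread_ids = sorted(
        (f"{consumer}-{thread}", consumer)
        for consumer in consumers
        for thread in range(threads_per_consumer)
    )

    counts = Counter({consumer: 0 for consumer in consumers})
    # Deal one partition at a time, wrapping around the sorted thread list.
    for _, (_, owner) in zip(range(partitions), cycle(thread_ids)):
        counts[owner] += 1
    return dict(counts)


if __name__ == "__main__":
    print(assign(["consumer-1", "consumer-2"], 2, 6))
    print(assign(["consumer-1", "consumer-2"], 16, 48))
```

Because every consumer-1 thread sorts ahead of every consumer-2 thread, the
partitions left over after the first full pass all land on consumer-1, which
is where the 4/2 and 32/16 splits come from.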
> RoundRobinAssignor clusters by consumer
> ---------------------------------------
>
> Key: KAFKA-2019
> URL: https://issues.apache.org/jira/browse/KAFKA-2019
> Project: Kafka
> Issue Type: Bug
> Components: consumer
> Reporter: Joseph Holsten
> Assignee: Neha Narkhede
> Priority: Minor
> Attachments: 0001-sort-consumer-thread-ids-by-hashcode.patch
>
>
> When rolling out a change today, I noticed that some of my consumers are
> "greedy", taking far more partitions than others.
> The cause is that the RoundRobinAssignor is using a list of ConsumerThreadIds
> sorted by toString, which is {{ "%s-%d".format(consumer, threadId)}}. This
> causes each consumer's threads to be adjacent to each other.
> One possible fix would be to define ConsumerThreadId.hashCode, and sort by
> that.