[ 
https://issues.apache.org/jira/browse/KAFKA-2019?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14360892#comment-14360892
 ] 

Joseph Holsten commented on KAFKA-2019:
---------------------------------------

Exactly!

Essentially, if you've got 2 consumers each running 2 threads, and a topic with 
6 partitions, you'll end up with:
- consumer-1: 4 partitions
- consumer-2: 2 partitions

This kind of imbalance is expected for tiny nodes. But if you've got 2 beefy 
consumers each running 16 threads, and a topic with 48 partitions, you'll end 
up with:
- consumer-1: 32 partitions
- consumer-2: 16 partitions

The ratio is the same 2:1 in both cases, but the absolute gap is much larger 
with more threads, and this patch should be able to remedy the more extreme 
cases.
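
For anyone who wants to check the arithmetic, here is a rough Python simulation of the two scenarios above. It is only a sketch, not the actual RoundRobinAssignor code: the function name `round_robin` is made up, and it assumes partitions are dealt out one at a time over the thread ids sorted by their "consumer-threadId" string form.

```python
from collections import Counter


def round_robin(consumers, threads_per_consumer, num_partitions):
    """Deal partitions round-robin over thread ids sorted as strings.

    Hypothetical helper for illustration; it only mimics the ordering
    described in the issue, not Kafka's real assignor.
    """
    # Thread ids rendered like the "%s-%d" toString from the issue.
    thread_ids = sorted(
        f"{c}-{t}" for c in consumers for t in range(threads_per_consumer)
    )
    owned = Counter()
    for partition in range(num_partitions):
        thread = thread_ids[partition % len(thread_ids)]
        # Strip the trailing "-threadId" to recover the owning consumer.
        consumer = thread.rsplit("-", 1)[0]
        owned[consumer] += 1
    return dict(owned)


small = round_robin(["consumer-1", "consumer-2"], 2, 6)
large = round_robin(["consumer-1", "consumer-2"], 16, 48)
print(small)  # {'consumer-1': 4, 'consumer-2': 2}
print(large)  # {'consumer-1': 32, 'consumer-2': 16}
```

Because every thread of consumer-1 sorts ahead of every thread of consumer-2, the leftover partitions after the first full pass all land on consumer-1.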

> RoundRobinAssignor clusters by consumer
> ---------------------------------------
>
>                 Key: KAFKA-2019
>                 URL: https://issues.apache.org/jira/browse/KAFKA-2019
>             Project: Kafka
>          Issue Type: Bug
>          Components: consumer
>            Reporter: Joseph Holsten
>            Assignee: Neha Narkhede
>            Priority: Minor
>         Attachments: 0001-sort-consumer-thread-ids-by-hashcode.patch
>
>
> When rolling out a change today, I noticed that some of my consumers are 
> "greedy", taking far more partitions than others.
> The cause is that the RoundRobinAssignor is using a list of ConsumerThreadIds 
> sorted by toString, which is {{ "%s-%d".format(consumer, threadId)}}. This 
> causes each consumer's threads to be adjacent to each other.
> One possible fix would be to define ConsumerThreadId.hashCode, and sort by 
> that.
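
A rough Python sketch of the sort-by-hash idea from the description, for illustration only. It is not the attached patch: `stable_hash` and `assign_by_hash` are hypothetical names, and CRC32 is just a stand-in for whatever ConsumerThreadId.hashCode would compute. Note that a hash ordering only breaks up the adjacency of one consumer's threads; it does not by itself guarantee an even split.

```python
import zlib
from collections import Counter


def stable_hash(thread_id):
    # Assumption: CRC32 stands in for a ConsumerThreadId.hashCode.
    return zlib.crc32(thread_id.encode("utf-8"))


def assign_by_hash(consumers, threads_per_consumer, num_partitions):
    """Deal partitions round-robin over thread ids ordered by hash."""
    thread_ids = [
        f"{c}-{t}" for c in consumers for t in range(threads_per_consumer)
    ]
    # Tie-break on the id itself so the ordering is fully deterministic.
    ordered = sorted(thread_ids, key=lambda tid: (stable_hash(tid), tid))
    owned = Counter()
    for partition in range(num_partitions):
        consumer = ordered[partition % len(ordered)].rsplit("-", 1)[0]
        owned[consumer] += 1
    return ordered, dict(owned)


order, counts = assign_by_hash(["consumer-1", "consumer-2"], 16, 48)
print(order[:8])  # threads from both consumers are typically interleaved
print(counts)
```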



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
