[ https://issues.apache.org/jira/browse/KAFKA-2019?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14360892#comment-14360892 ]

Joseph Holsten commented on KAFKA-2019:
---------------------------------------

Exactly! Essentially, if you've got 2 consumers each running 2 threads, and a topic with 6 partitions, you'll end up with:

- consumer-1: 4 partitions
- consumer-2: 2 partitions

This kind of imbalance is expected for tiny nodes. But if you've got 2 beefy consumers each running 16 threads, and a topic with 48 partitions, you'll end up with:

- consumer-1: 32 partitions
- consumer-2: 16 partitions

Same difference, except this patch should be able to remedy the more extreme cases.

> RoundRobinAssignor clusters by consumer
> ---------------------------------------
>
>                 Key: KAFKA-2019
>                 URL: https://issues.apache.org/jira/browse/KAFKA-2019
>             Project: Kafka
>          Issue Type: Bug
>          Components: consumer
>            Reporter: Joseph Holsten
>            Assignee: Neha Narkhede
>            Priority: Minor
>         Attachments: 0001-sort-consumer-thread-ids-by-hashcode.patch
>
>
> When rolling out a change today, I noticed that some of my consumers are
> "greedy", taking far more partitions than others.
> The cause is that the RoundRobinAssignor is using a list of ConsumerThreadIds
> sorted by toString, which is {{"%s-%d".format(consumer, threadId)}}. This
> causes each consumer's threads to be adjacent to each other.
> One possible fix would be to define ConsumerThreadId.hashCode, and sort by
> that.
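
The arithmetic in the comment above can be reproduced with a small standalone simulation. This is a minimal sketch in Python, not Kafka's actual RoundRobinAssignor code: it assumes a single topic, deals partitions one at a time over the sorted thread ids, and uses hypothetical helper names (thread_ids, partitions_per_consumer, by_string, by_hash). The by_hash ordering uses CRC32 purely as a deterministic stand-in for the proposed ConsumerThreadId.hashCode; the attached patch may well hash differently.

```python
import zlib
from collections import Counter
from itertools import cycle


def thread_ids(consumers, threads_per_consumer):
    # Mirrors the "%s-%d" naming quoted in the issue description.
    return [f"{c}-{t}" for c in consumers for t in range(threads_per_consumer)]


def partitions_per_consumer(ids, num_partitions, sort_key):
    """Deal partitions round-robin over the sorted thread ids, then count per consumer."""
    ordered = sorted(ids, key=sort_key)
    counts = Counter()
    for _, tid in zip(range(num_partitions), cycle(ordered)):
        consumer = tid.rsplit("-", 1)[0]  # strip the trailing thread number
        counts[consumer] += 1
    return dict(counts)


def by_string(tid):
    # Current behaviour described in the issue: order by the string form.
    return tid


def by_hash(tid):
    # Assumption: CRC32 stands in for a hashCode-based ordering.
    return zlib.crc32(tid.encode("utf-8"))


small = thread_ids(["consumer-1", "consumer-2"], 2)
large = thread_ids(["consumer-1", "consumer-2"], 16)

print(partitions_per_consumer(small, 6, by_string))   # {'consumer-1': 4, 'consumer-2': 2}
print(partitions_per_consumer(large, 48, by_string))  # {'consumer-1': 32, 'consumer-2': 16}
print(partitions_per_consumer(large, 48, by_hash))    # depends on how the hashes interleave
```

Sorting by string puts every consumer-1 thread ahead of every consumer-2 thread, so the partitions left over after the first full pass all land on consumer-1. A hash ordering only mixes the threads pseudo-randomly, so it should narrow the gap on average but is not guaranteed to split the partitions evenly.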