[ https://issues.apache.org/jira/browse/KAFKA-2331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15766509#comment-15766509 ]
Jeff Widman commented on KAFKA-2331: ------------------------------------ Isn't this what round robin partitioning strategy was trying to solve? If so, this issue should be closed. > Kafka does not spread partitions in a topic among all consumers evenly > ---------------------------------------------------------------------- > > Key: KAFKA-2331 > URL: https://issues.apache.org/jira/browse/KAFKA-2331 > Project: Kafka > Issue Type: Bug > Components: core > Affects Versions: 0.8.1.1 > Reporter: Stefan Miklosovic > > I want to have 1 topic with 10 partitions. I am using default configuration > of Kafka. I create 1 topic with 10 partitions by that helper script and now I > am about to produce messages to it. > The thing is that even all partitions are indeed consumed, some consumers > have more then 1 partition assigned even I have number of consumer threads > equal to partitions in a topic hence some threads are idle. > Let's describe it in more detail. > I know that common stuff that you need one consumer thread per partition. I > want to be able to commit offsets per partition and this is possible only > when I have 1 thread per consumer connector per partition (I am using high > level consumer). > So I create 10 threads, in each thread I am calling > Consumer.createJavaConsumerConnector() where I am doing this > topicCountMap.put("mytopic", 1); > and in the end I have 1 iterator which consumes messages from 1 partition. > When I do this 10 times, I have 10 consumers, consumer per thread per > partition where I can commit offsets independently per partition because if I > put different number from 1 in topic map, I would end up with more then 1 > consumer thread for that topic for given consumer instance so if I am about > to commit offsets with created consumer instance, it would commit them for > all threads which is not desired. > But the thing is that when I use consumers, only 7 consumers are involved and > it seems that other consumer threads are idle but I do not know why. > The thing is that I am creating these consumer threads in a loop. So I start > first thread (submit to executor service), then another, then another and so > on. > So the scenario is that first consumer gets all 10 partitions, then 2nd > connects so it is splits between these two to 5 and 5 (or something similar), > then other threads are connecting. > I understand this as a partition rebalancing among all consumers so it > behaves well in such sense that if more consumers are being created, > partition rebalancing occurs between these consumers so every consumer should > have some partitions to operate upon. > But from the results I see that there is only 7 consumers and according to > consumed messages it seems they are split like 3,2,1,1,1,1,1 partition-wise. > Yes, these 7 consumers covered all 10 partitions, but why consumers with more > then 1 partition do no split and give partitions to remaining 3 consumers? > I am pretty much wondering what is happening with remaining 3 threads and why > they do not "grab" partitions from consumers which have more then 1 partition > assigned. -- This message was sent by Atlassian JIRA (v6.3.4#6332)