[ https://issues.apache.org/jira/browse/KAFKA-2331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15786920#comment-15786920 ]
Ewen Cheslack-Postava commented on KAFKA-2331:
----------------------------------------------

This seems like a separate issue -- it sounds like the consumers are joining the group but not getting assigned any partitions when there is only 1 topic and 10 partitions. Both range and round-robin assignment should have the same behavior in that case.

But the split that is shown (3,2,1,1,1,1,1) doesn't seem likely either. It seems more likely that it is an artifact of aggregating all the partitions each thread saw messages for, without taking into account that when the first couple of instances join there will be a period when they are fetching data for *all* partitions and can validly see data from more than 1 partition.

This is quite old and is for the old consumer, but if someone wanted to tackle it, one strategy might be to use a ConsumerRebalanceListener to get more information about how partitions are being assigned. That might also reveal other issues, such as some members not successfully joining the group.
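To make that suggestion concrete, here is a minimal sketch of such a diagnostic written against the new Java consumer (the issue itself is about the old high-level consumer; the bootstrap address, group id, and topic name are placeholder assumptions):

import java.time.Duration;
import java.util.Collection;
import java.util.Collections;
import java.util.Properties;

import org.apache.kafka.clients.consumer.ConsumerRebalanceListener;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.TopicPartition;

public class AssignmentLogger {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092"); // assumption: local broker
        props.put("group.id", "mygroup");                  // assumption: group name
        props.put("key.deserializer",
                  "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer",
                  "org.apache.kafka.common.serialization.StringDeserializer");

        KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props);

        // Log every revoke/assign so the actual partition distribution is visible,
        // instead of inferring it from which partitions messages happened to arrive on.
        consumer.subscribe(Collections.singletonList("mytopic"), new ConsumerRebalanceListener() {
            @Override
            public void onPartitionsRevoked(Collection<TopicPartition> partitions) {
                System.out.println(Thread.currentThread().getName() + " revoked: " + partitions);
            }

            @Override
            public void onPartitionsAssigned(Collection<TopicPartition> partitions) {
                System.out.println(Thread.currentThread().getName() + " assigned: " + partitions);
            }
        });

        while (true) {
            ConsumerRecords<String, String> records = consumer.poll(Duration.ofMillis(500));
            for (ConsumerRecord<String, String> record : records) {
                // process record.partition(), record.offset(), ...
            }
        }
    }
}

Running one of these per thread and comparing the onPartitionsAssigned output across threads would show directly whether the assignment itself is uneven, or whether the 3,2,1,1,1,1,1 split is just a counting artifact from the rebalance window.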
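For reference, the per-thread connector setup described in the report below looks roughly like this with the 0.8 high-level consumer (a sketch only; the ZooKeeper address and group id are assumptions):

import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Properties;

import kafka.consumer.Consumer;
import kafka.consumer.ConsumerConfig;
import kafka.consumer.ConsumerIterator;
import kafka.consumer.KafkaStream;
import kafka.javaapi.consumer.ConsumerConnector;
import kafka.message.MessageAndMetadata;

public class SingleStreamConsumer implements Runnable {
    public void run() {
        Properties props = new Properties();
        props.put("zookeeper.connect", "localhost:2181"); // assumption
        props.put("group.id", "mygroup");                  // assumption

        // One connector per thread, asking for a single stream for the topic.
        ConsumerConnector connector =
                Consumer.createJavaConsumerConnector(new ConsumerConfig(props));

        Map<String, Integer> topicCountMap = new HashMap<String, Integer>();
        topicCountMap.put("mytopic", 1);

        Map<String, List<KafkaStream<byte[], byte[]>>> streams =
                connector.createMessageStreams(topicCountMap);
        KafkaStream<byte[], byte[]> stream = streams.get("mytopic").get(0);

        ConsumerIterator<byte[], byte[]> it = stream.iterator();
        while (it.hasNext()) {
            MessageAndMetadata<byte[], byte[]> msg = it.next();
            // msg.partition() shows which partition this thread is currently reading.
            System.out.println(Thread.currentThread().getName()
                    + " got message from partition " + msg.partition());
            connector.commitOffsets();
        }
    }
}

Each thread owns its own connector, so commitOffsets() only touches the partitions that connector was assigned, which is the per-partition commit behavior the reporter is after.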
> Kafka does not spread partitions in a topic among all consumers evenly
> ----------------------------------------------------------------------
>
>                 Key: KAFKA-2331
>                 URL: https://issues.apache.org/jira/browse/KAFKA-2331
>             Project: Kafka
>          Issue Type: Bug
>          Components: clients, consumer
>    Affects Versions: 0.8.1.1
>            Reporter: Stefan Miklosovic
>
> I want to have 1 topic with 10 partitions. I am using the default configuration of Kafka. I create 1 topic with 10 partitions with the helper script, and now I am about to produce messages to it.
> The thing is that even though all partitions are indeed consumed, some consumers have more than 1 partition assigned even though I have as many consumer threads as partitions in the topic, hence some threads are idle.
> Let's describe it in more detail.
> I know the common advice that you need one consumer thread per partition. I want to be able to commit offsets per partition, and this is possible only when I have 1 thread per consumer connector per partition (I am using the high-level consumer).
> So I create 10 threads, and in each thread I call Consumer.createJavaConsumerConnector(), where I do this:
> topicCountMap.put("mytopic", 1);
> and in the end I have 1 iterator which consumes messages from 1 partition. When I do this 10 times, I have 10 consumers -- a consumer per thread per partition -- where I can commit offsets independently per partition, because if I put a number other than 1 in the topic map, I would end up with more than 1 consumer thread for that topic for a given consumer instance, so committing offsets with that consumer instance would commit them for all threads, which is not desired.
> But the thing is that when I use the consumers, only 7 consumers are involved, and it seems that the other consumer threads are idle, but I do not know why.
> The thing is that I am creating these consumer threads in a loop. So I start the first thread (submit to an executor service), then another, then another, and so on.
> So the scenario is that the first consumer gets all 10 partitions, then the 2nd connects, so it is split between these two into 5 and 5 (or something similar), then the other threads connect.
> I understand this as partition rebalancing among all consumers, so it behaves well in the sense that if more consumers are created, partition rebalancing occurs between these consumers, so every consumer should have some partitions to operate upon.
> But from the results I see that there are only 7 consumers, and according to the consumed messages it seems they are split like 3,2,1,1,1,1,1 partition-wise. Yes, these 7 consumers covered all 10 partitions, but why do consumers with more than 1 partition not split and give partitions to the remaining 3 consumers?
> I am pretty much wondering what is happening with the remaining 3 threads and why they do not "grab" partitions from consumers which have more than 1 partition assigned.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)